REMARKS 

Claims 6-9, 26, 27 and 32-34 are pending in this application. Claim 34 is amended 
herein to clarify and more particularly define the invention. No new matter is added by this 
amendment. In light of the following amendments and remarks, applicants respectfully request 
reconsideration of this application and allowance of the claims to issue. 

I. Rejection under 35 U.S.C. § 112 

A. The Action states that claims 6-9, 26, 27 and 32-34 stand rejected under 35 U.S.C. § 
112, first paragraph, for allegedly failing to comply with the written description requirement. 

Claim 6 as presented herein encompasses a specific genus of nucleic acid sequences 
encoding peptides immunochemically reactive with antibodies to the Epstein Barr Virus (EBV) 
VCA-p18 or VCA-p40 proteins, comprising an epitope of the VCA-p18 or VCA-p40 protein, 
encoded within the EBV open reading frames BFRF3 and BdRF1, respectively, and wherein said 
antibodies are antibodies having the same reactivity with VCA-p18 as antibodies produced by the 
hybridomas deposited at the European Collection of Animal Cell Cultures under deposit numbers 
93020413 or 93020412 or antibodies having the same reactivity with VCA-p40 as antibodies 
produced by the hybridoma deposited at the European Collection of Animal Cell Cultures under 
deposit number 93020414. 

In claim 6, the nucleic acid sequences encode peptides comprising an epitope of the VCA-p18 
or VCA-p40 protein. The peptides encoded by the nucleic acid sequences of claim 6 are further 
defined by their immunoreactivity with hybridoma-derived antibodies to VCA-p18 and VCA-p40 
defined as European Collection of Animal Cell Cultures deposit numbers 93020413 or 93020412 
(VCA-p18) or deposit number 93020414 (VCA-p40). As one of skill in the art would recognize, 
hybridoma-derived antibodies are monoclonal. One of skill in the art further recognizes that 
monoclonal antibodies are specific to a single epitope and that all of the antibodies produced 
from a single hybridoma are identical. Therefore, contrary to the arguments in the Office Action, 
these antibodies are not heterogeneous and, thus, are sufficient to define the genus of nucleic 
acid sequences encompassed by claim 6. 

Furthermore, the specification demonstrates actual reduction to practice of the nucleic acids 
of claim 6 and in particular, provides several examples of peptides comprising epitopes of this 
invention (e.g., SEQ ID NOs:2, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 and 22; 
see pages 7-13 for description of peptides and fragments of this invention and Examples 4 and 5, 
Figures 4-6 and Table 1) that are reactive with the EBV VCA-p18 or VCA-p40 monoclonal 
antibodies of claim 6 and also provides examples of nucleic acid sequences encoding such 
peptides (e.g., SEQ ID NO:1, SEQ ID NO:3). Accordingly, one of skill in the art would 
recognize that applicants were in possession of the nucleic acid sequences of claim 6 at the time 
the present application was filed, as evidenced by the large numbers of representative species 
disclosed in the specification. 

Therefore, applicants respectfully submit that all of the members of the genus of nucleic 
acids of claim 6 are adequately defined both structurally and functionally, leading one of 
ordinary skill in the art to the reasonable conclusion that applicants were in possession of the 
invention of claim 6 at the time this application was filed. 

Claims 7 and 8 as presented herein respectively encompass a specific genus of nucleic 
acid sequences comprising the nucleotide sequence or a subsequence of SEQ ID NO:1, wherein 
the subsequence encodes a peptide that comprises an epitope that is immunochemically reactive 
with antibodies to EBV VCA-p18 protein (claim 7), and a specific genus of nucleic acid 
sequences comprising the nucleotide sequence or a subsequence of SEQ ID NO:3, wherein the 
subsequence encodes a peptide that comprises an epitope that is immunochemically reactive with 
antibodies to EBV VCA-p40 protein (claim 8). 

In both claims 7 and 8, the subsequence is defined as encoding a peptide comprising an 
epitope. It would be readily recognized by one of skill in the art that applicants were in 
possession of the genus of nucleic acid sequences of claims 7 and 8 at the time the application 
was filed because the specification provides several examples of peptides comprising epitopes of 
this invention (e.g., SEQ ID NOs:2, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 and 
22; see pages 7-13 for description of peptides and fragments of this invention and Examples 4 
and 5, Figures 4-6 and Table 1) that are reactive with the EBV VCA-p18 or VCA-p40 antibodies 
of this invention and also provides examples of nucleic acid sequences encoding such peptides 
(e.g., SEQ ID NO:1, SEQ ID NO:3). Thus, one of skill in the art would recognize that applicants 
were in possession of the nucleotide sequences and subsequences of claims 7 and 8 at the time 
the present application was filed, as evidenced by the large numbers of representative species 
disclosed in the specification. 

In particular, all of the members of the genus of the nucleotide sequences of claims 7 and 
8 could be readily identified by one of ordinary skill in the art on the basis of the disclosure of 
the nucleotide sequences of SEQ ID NO:1 or SEQ ID NO:3. Such a genus is not overly broad, 
considering that every member must be a subsequence of a disclosed sequence (SEQ ID NO:1 or 
SEQ ID NO:3), thereby defining the members of the genus structurally, AND every member of 
the genus must also meet the functional requirement of encoding an EBV peptide comprising an 
epitope that is immunochemically reactive with antibodies to the EBV VCA-p18 protein or the 
EBV VCA-p40 protein. Thus, all of the members of the genus of nucleic acids of claims 7 and 8 
are adequately defined both structurally and functionally, leading one of ordinary skill in the art 
to the reasonable conclusion that applicants were in possession of the invention of claims 7 and 8 
at the time this application was filed. 

Claims 9, 26 and 27 depend from claims 6, 7 and 8, respectively, and recite a vector 
molecule comprising the nucleic acid molecule of each respective independent claim. Because 
the nucleic acid sequences of claims 6, 7 and 8 are adequately described in the specification, the 
vectors of these claims are adequately described as well. 

With regard to claims 32-33, the specification presents data demonstrating that the 
inventors produced more than 330 12-mers of VCA-p40 and more than 160 12-mers of VCA-p18, 
as described in Examples 4 and 5 and as shown in Figures 4-6 and in Table 1 of the specification, 
thereby adequately describing the genus of 12 contiguous amino acids as set forth in these 
claims. Specifically, Example 4 describes the production of a full set of peptides with a length 
of 12 amino acids and an overlap of 11 amino acids of the amino acid sequences of both ORFs 
BFRF3 (VCA-p18) and BdRF1 (VCA-p40) (page 30). These peptides were assayed for 
immunoreactivity with EBV-specific antibodies (Example 4, page 31 and Example 5, page 33), 
and results of these assays are shown for the VCA-p18 peptides in Figures 4 and 5 and for the 
VCA-p40 peptides in Figure 6. Specifically, Figure 6 shows immunoreactivity results of almost 
340 peptides of VCA-p40 and Figures 4 and 5 show such results for more than 160 peptides of 
VCA-p18. Thus, one of skill in the art would reasonably conclude that the peptides of claims 32 
and 33 are adequately supported in the specification. 
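
The peptide counts cited above are consistent with simple arithmetic on the overlapping-peptide design of Example 4: when consecutive 12-mers overlap by 11 residues, a protein of N residues yields N - 11 peptides. The following Python sketch is illustrative only. The function name and the toy sequence are hypothetical, and the residue counts of 176 (VCA-p18) and 345 (VCA-p40) are an assumption taken from the enclosed protein alignments, which list 177 and 346 positions including the terminal stop symbol.

```python
def overlapping_peptides(sequence: str, length: int = 12, overlap: int = 11) -> list[str]:
    """Return every peptide of `length` residues, stepping by (length - overlap)."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than length")
    last_start = len(sequence) - length
    # range() is empty when the sequence is shorter than one peptide
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


# Toy 15-residue sequence (hypothetical, for illustration only)
toy_peptides = overlapping_peptides("ABCDEFGHIKLMNPQ")

# Placeholder sequences of the assumed lengths; only the length matters here
p18_count = len(overlapping_peptides("A" * 176))  # assumed VCA-p18 length
p40_count = len(overlapping_peptides("A" * 345))  # assumed VCA-p40 length
```

Under these assumptions the sketch gives 165 peptides for VCA-p18 and 334 for VCA-p40, in line with the "more than 160" and "more than 330" figures stated above.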

Further, claim 34 as presented herein recites an isolated nucleic acid sequence encoding 
the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6 or a combination of both, wherein 
said amino acid sequence is immunochemically reactive with antibodies to the Epstein-Barr 
Virus VCA-p18 protein. Claim 34 thus recites a nucleic acid sequence encoding the specific 
amino acid sequences of SEQ ID NO:5 and/or SEQ ID NO:6, which are disclosed in the 
specification at least on page 9, second paragraph. Accordingly, the nucleic acid sequences of claim 34 
are adequately supported in the specification. 

Thus, at least for the reasons set forth above, applicants believe that this rejection has 
been overcome and its withdrawal and allowance of the pending claims are respectfully 
requested. 

B. The Action states that claims 6-9, 26, 27 and 32-34 stand rejected under 35 U.S.C. § 
112, first paragraph, for allegedly failing to comply with the enablement requirement. 
Specifically, the Action states that to the extent that the claimed sequences are not adequately 
described in the instant disclosure, claims 6-9, 26, 27 and 32-34 are also rejected under 35 U.S.C. 
§ 112, first paragraph, as allegedly containing subject matter which was not described in the 
specification in such a way as to enable one skilled in the art to make and/or use the invention. 

As discussed above, the subject matter of claims 6-9, 26, 27 and 32-34 is adequately 
described in the present specification. The specification not only adequately discloses the full 
genus of nucleic acid sequences of this invention, but also provides detailed teachings of how to 
make and use these nucleic acid sequences. See, in particular, the Examples set forth on pages 
22-26, wherein numerous working examples are provided of the production and testing of 
numerous peptides of this invention. Thus, applicants respectfully submit that the present 
invention is adequately enabled and applicants thereby respectfully request withdrawal of this 
rejection. 

II. Rejection under 35 U.S.C. § 102(b) 

A. The Action states that claims 6-9, 26, 27 and 32-34 stand rejected under 35 U.S.C. § 
102(b) as allegedly anticipated by Laux et al. (EMBO J. 7:769-774 (1988)). Specifically, the 
Action states that Laux et al. teaches a nucleic acid sequence comprising instant SEQ ID NO:1 
which encodes at least 12 contiguous amino acids of EBV VCA-p18 (the amino acid sequence 
SEQ ID NO:5). The Action further states that Laux et al. teaches a nucleic acid sequence 
comprising a sequence that shares 98.8% homology with instant SEQ ID NO:3 (subsequence 
thereof), which encodes 12 contiguous amino acids of an EBV VCA-p40. On this basis, the 
Action concludes that Laux et al. anticipates the instant claims. Applicants respectfully disagree 
and traverse this rejection. 

Specifically, applicants have performed multiple alignments comparing both the 
nucleotide and amino acid sequences of the present invention with those of Figure 2 of Laux et 
al. (NCBI Accession No. Y00835.1) and the sequence homology asserted in the Action to be 
present was not found (See enclosed Alignments 1-6). Accordingly, applicants respectfully 
submit that Laux et al. fails to disclose the nucleotide sequences of SEQ ID NO:1 or SEQ ID 
NO:3 or any subsequences thereof encoding 12 contiguous amino acids, as claimed herein. If 
the Examiner maintains this rejection, it is respectfully requested that the Examiner specifically 
point out which portion of the sequences of Laux et al. has homology with the sequences of the 
present invention. Otherwise, applicants respectfully request that this rejection be withdrawn. 
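
The summaries on the enclosed alignments appear to report identity as the number of identical aligned positions divided by the total number of alignment columns (for example, 174/2484 for Alignment 1 and 323/2749 for Alignment 2). A minimal Python sketch of that calculation follows. The function name and the toy rows are hypothetical, and treating "-" and "=" as gap symbols is an assumption based on the printouts, not a description of how the alignment software works internally.

```python
GAP_SYMBOLS = {"-", "="}  # assumed gap characters, as they appear in the printouts


def percent_identity(row_a: str, row_b: str) -> tuple[int, int, float]:
    """Return (matches, columns, percent) for two equal-length aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same number of columns")
    if not row_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only when both rows carry the same residue
    matches = sum(
        1 for a, b in zip(row_a, row_b) if a == b and a not in GAP_SYMBOLS
    )
    columns = len(row_a)
    return matches, columns, round(100.0 * matches / columns, 1)


# Toy aligned rows (hypothetical): 6 identical columns out of 9
toy_result = percent_identity("acgt-acgt", "acgaaac=t")

# The reported fractions reproduce the reported percentages
alignment_1_percent = round(100.0 * 174 / 2484, 1)
alignment_2_percent = round(100.0 * 323 / 2749, 1)
```

On this reading, the reported fractions correspond to 7.0% and 11.7% identity, far below the 98.8% homology asserted in the Action.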

B. The Action states that claims 6-9, 26, 27 and 32-34 stand rejected under 35 U.S.C. § 
102(b) as allegedly anticipated by Bankier et al. (Mol. Biol. Med. 1:425-445 (1983)). 
Specifically, the Action states that Bankier et al. teaches a nucleic acid sequence comprising 
instant SEQ ID NO:1, which encodes at least 12 contiguous amino acids of EBV VCA-p18 (the 
amino acid sequence SEQ ID NO:5). The Action further states that Bankier et al. teaches a 
nucleic acid sequence comprising a sequence that shares 98.8% homology with instant SEQ ID 
NO:3 (subsequence thereof), which encodes 12 contiguous amino acids of an EBV VCA-p40. On 
this basis, the Action concludes that Bankier et al. anticipates the instant claims. Applicants 
respectfully disagree and traverse this rejection. 

Specifically, applicants have performed multiple alignments comparing both the 
nucleotide and amino acid sequences of the present invention with those of Figure 2 of Bankier 
et al. and the sequence homology asserted in the Action to be present was not found (See 
enclosed Alignments 7-38). Accordingly, applicants respectfully submit that Bankier et al. fails 
to disclose the nucleotide sequences of SEQ ID NO:1 or SEQ ID NO:3 or any subsequences 
thereof encoding 12 contiguous amino acids, as claimed herein. If the Examiner maintains this 
rejection, it is respectfully requested that the Examiner specifically point out which portion of the 
sequences of Bankier et al. has homology with the sequences of the present invention. 
Otherwise, applicants respectfully request that this rejection be withdrawn. 

The points and concerns raised in the outstanding Office Action having been addressed in 
full, it is respectfully submitted that all of the claims of this application are in condition for 
allowance, which action is respectfully requested. Should the Examiner have any remaining 
concerns, the Examiner is invited and encouraged to contact the undersigned attorney directly by 
telephone in order to expedite the prosecution of this application to allowance. 

Attorney Docket No. 9310-13DVCTDV 
Serial No.: 10/036,729 
Filed: December 21, 2001 



The Commissioner is authorized to charge Deposit Account No. 50-0220 in the amount 
of $120.00 as fee for a one-month extension of time. This amount is believed to be correct. 
However, the Commissioner is authorized to charge any deficiency or credit any overpayment to 
Deposit Account No. 50-0220. 



Respectfully submitted, 

Mary L. Miller 
Registration No. 39,303 

Customer No. 20792 

Myers Bigel Sibley & Sajovec, P.A. 
P.O. Box 37428 
Raleigh, North Carolina 27627 
Telephone: (919) 854-1400 
Facsimile: (919) 854-1401 



CERTIFICATE OF EXPRESS MAILING UNDER 37 CFR 1.10 
"Express Mail" mailing label number: EV887524629US 

Date of Deposit: July 23, 2007 

I hereby certify that this paper or fee is being deposited with the United States 
Postal Service "Express Mail Post Office to Addressee" service under 37 CFR 
1.10 on the date indicated above and is addressed to Mail Stop Amendment, 
Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450. 

Tracy Wallace 

9310-13DVCTDV SEQ ID NO 1.xdna x Laux et al. EBV terminal gene.xdna => DNA Parallel 

DNA sequence 538 bp catgatggcacg ... aaacagtagccc linear 
DNA sequence 2227 bp gcagtgtgtgaa ... aaaaaaaaaaaa linear 

Method: Blocks (Martinez) 
Layout: Standard 
Mismatch penalty: Smaller (1) 
Gap penalty: Medium (2) 
Translation: Off 

Alignment 1. Comparison of the nucleotide sequence of SEQ ID NO:1 with the nucleotide 
sequence of Fig. 2 of Laux et al. 



[Alignment 1 printout: SEQ ID NO:1 (538 bp) aligned base by base against the Laux et al. Fig. 2 sequence (2227 bp), with position rulers and match markers.] 

% Identity = 7.0 (174/2484) 

/// 

9310-13DVCTDV SEQ ID NO 3.xdna x Laux et al. EBV terminal gene.xdna => DNA Parallel 

DNA sequence 1038 bp atgctatcaggt ... cgcgtggcttga linear 
DNA sequence 2227 bp gcagtgtgtgaa ... aaaaaaaaaaaa linear 

Method: Blocks (Martinez) 
Layout: Standard 
Mismatch penalty: Smaller (1) 
Gap penalty: Medium (2) 
Translation: Off 

Alignment 2. Comparison of the nucleotide sequence of SEQ ID NO:3 with the nucleotide 
sequence of Fig. 2 of Laux et al. 



[Alignment 2 printout: SEQ ID NO:3 (1038 bp) aligned base by base against the Laux et al. Fig. 2 sequence (2227 bp), with position rulers and match markers.] 

% Identity = 11.7 (323/2749) 

/// 



### DNA Strider 1.4f6 ### Friday, July 13, 2007 4:46:04 PM (US Letter @ 100%) 

9310-13DVCTDV SEQ ID NO 2.xprt x Laux et al. EBV terminal gene.xprt => Protein Alignment 

Protein sequence 177 aa MARRLPKPTLQG . . . DTAPRGARKKQ* 

Protein sequence 498 aa MGSLEMVPMGAG . . . ERPPTPYRNTV* 



Alignment 3 . Comparison of the amino acid sequence 
encoded by the nucleotide sequence of SEQ ID NO: 1 
(SEQ ID NO: 2) with the amino acid sequence encoded 
by the nucleotide sequence of Fig. 2 of Laux et al. 



Method: Diagonals (BLOSUM62) 
Layout: Standard 
Block Length <: 6-aa 
Mismatch penalty: Smaller (1) 
Gap penalty: Medium (2) 
Display: BLOSUM62 



1 MGSLEMVPMGAGPPSPGGDPDGYDGGNNSQYPS AS GSSGNTPTPPN DEERE SNEEPPPPY 6 0 

20 - 40 • 60 



61 EDPYWGNGDRHSDYQPLGTQDQSLYLGLQHDGNDGLPPPPYSPRDDSSQHIYEEAGRGSM 12 0 

80 • 100 « 120 



121 NPVCLPVIVAPYLFWLAAIAASCFTASVSTVVTATGLALSLLLLAAVASSYAAAQRKLLT 180 

140 • 160 • 180 



181 PVTVLTAVVTFFAICLTWRIEDPPFNSLLFALLAAAGGLQGIYVLVMLVLLILAYRRRWR 2 4 0 

200 • 220 • 240 



241 RLTVCGGIMFLACVLVLIVDAVLQLSPLLGAVTVVSMTLLLLAFVLWLSSPGGLGTLGAA 30 0 

260 ■ 280 • 300 

20 

1 MARRLPKPTLQGRLEADFPDSPLLPKFQELNQNNLPN 37 

M + TLL SLKL+L 
301 LLTLAAALALLASLILGTLNLTTMFLLMLLWTL VVLLICSSCSSCPLSK-ILLARLFL=Y 35 8 

320 • 340 

40 • 60 ■ 80 

3 8 DVFREAQRSYLVFLTSQFCYEEYVQRTFGVPRRQRAIDKRQRASVAGAGAHAHLGGSSAT 97 
+ S L+ S + + A + A GS 

359 ALALLLLASALIAGGSILQTNFKSLSSTEFIPNLFCMLLLIVAGILFILAILTEWGSGNR 418 
360 • 380 • 400 

100 • 120 • 140 

9 8 PVQQAQAAASAGTGALASSAPSTAVAQSATPSVSSS ISSLRAATSGATAAASAAAAVDTG 15 7 

+A+ T ++ + + + L A 
419 TYGPVFMCLGGLLTMVAGAVWLTVMSNTLLSAWILTAGFLIFLIGFALFGVIRCCRYCCY 4 7 8 
420 • 440 • 460 

160 

158 SGGGGQPH DTAPRGARKKQ* 177 
+ + P R * 

479 YCLTLESEERPPTPYRNTV* 498 
480 

% Identity = 4.6 (23/500) % Homology = 3.2 (16/500) % Total = 7.8 (39/500) 



/// 



### DNA Strider 1.4f6 ### Friday, July 13, 2007 4:46:36 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 4.xprt x Laux et al. EBV terminal gene.xprt => Protein Alignment 

Protein sequence 346 aa MLSGNAGEGATA ... FCEELLNKRVA* 
Protein sequence 498 aa MGSLEMVPMGAG ... ERPPTPYRNTV* 

Method: Diagonals (BLOSUM62) 
Layout: Standard 
Mismatch penalty: Smaller (1) 
Gap penalty: Medium (2) 
Display: BLOSUM62 

Alignment 4. Comparison of the amino acid sequence encoded by the nucleotide sequence of 
SEQ ID NO:3 (SEQ ID NO:4) with the amino acid sequence encoded by the nucleotide sequence 
of Fig. 2 of Laux et al. 

20 • 40 

1 MLSGNAGE-GATACG-GSAAAGQDLISVPRNTFMTLLQTNLDNKPPRQTPLPYAAPLPPF 5 8 

MS GA G GD + + + N P + P PP + 

1 MGSLEMVPMGAGPPSPGGDPDGYDGGNNSQYPSASGSSGNTPTPPNDEERESNEEPPPPY 60 

20 -40 • 60 

60 80 ■ 100 

59 SHQAIATAP — S- YGP-GAGAVAPAGGYFTSPG-GYYAGPAGG-DPGAFLAMD-AHTYHP 111 

SYPG + G G P D + + A 

61 EDPYWGNGDRHSDYQPLGTQDQSLYLGLQHDGNDGLPPPPYSPRDDSSQHIYEEAGRGSM 120 

80 ■ 100 - 120 

120 • 140 ■ 160 

112 HPHPPPAYFG — LPGLFGPPPPCLLTTDSHLRADYVPAPSRSNKRKRDPEEDEEGGGLF- 168 
+ PP LL C+S+ AS L 

,121 NPVCLPVIVAPYLFWLAAIAASCFTASVST VVTATGLALSLLLLAAVASSYAAAQRKLLT 18 0 

140 • 160 • 180 

180 • 200 • 220 

16 9 PGEDATLYRKDIAGLSKSVNELQHTLQALRRETLS YGHTGVGYCPQQGPCYTHSGPYGFQ 22 8 

PT A E L+GY ++ + 

181 PVTVLTAVVTFFAICLTWRIEDPPFNSLLFALLAAAGGLQGI YVLVMLVLLILAYRRRWR 2 4 0 

200 • 220 • 240 

240 • 260 • 280 

2 29 PHQSYEVPRYVPHPPPPPTSHQAAQAQPPPPGTQAPEAHCV-AE STIPE A-GAAGNSGPR 2 86 

+ + + T +A +GGG 

241 RLTVCGGIMFLACVLVLIVDAVLQLSPLLGAVTVVSMTLLLLAFVLWLSSPGGLGTLGAA 300 

260 • 280 • 300 

300 ■ 320 • 340 

2 87 EDTNPQQPTTEGHHRGKKLVQASASGVAQSKEPTTPKAKSV-S AHLKS-IFCEELLNKRV 344 

T L + + SS+SIL + 

301 LLTLAAALALLASLILGTLNLTTMFLLMLLWTLVVLLICSSCSSCPLSKILLARLFLYAL 36 0 

320 • 340 • 360 

345 a* 346 

A 

361 ALLLLASALIAGGSILQTNFKSLSSTEFIPNLFCMLLLIVAGILFILAILTEWGSGNRTY 420 

380 • 400 • 420 

421 GPVFMCLGGLLTMVAGAVWLTVMSNTLLSAWILTAGFLIFLIGFALFGVIRCCRYCCYYC 4 80 

440 • 460 ■ 480 

481 LTLESEERPPTPYRNTV* 498 



% Identity = 10.0 (50/498) 



% Homology = 4.8 (24/498) 



% Total = 14.9 (74/498) 



/// 



### DNA Strider 1.4f6 ### Friday, July 13, 2007 4:47:29 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 5.xprt x Laux et al. EBV terminal gene.xprt => Protein Alignment 

Protein sequence 24 aa AVDTGSGGGGQP . . . HDTAPRGARKKQ 

Protein sequence 498 aa MGSLEMVPMGAG . . . ERPPTPYRNTV* 



Method: Diagonals (BLOSUM62) 

Layout : Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 



20 



Alignment 5. Comparison of the amino acid sequence 
of SEQ ID NO: 5 with the amino acid sequence encoded 
by the nucleotide sequence of Fig. 2 of Laux et al. 



1 AVDTGSGG-G-GQPHDTA-PRGARKKQ 2 4 

G G P P G 

1 MGSLEMVPMGAGPPSPGGDPDGYDGGNNSQYPSASGSSGNTPTPPNDEERESNEEPPPPY 6 0 
• 20 • 40 ■ 60 



61 EDPYWGNGDRHSDYQPLGTQDQSLYLGLQHDGNDGLPPPPYSPRDDSSQHI YEEAGRGSM 120 

80 • 100 • 120 



121 NPVCLPVIVAPYLFWLAAIAASCFTASVSTVVTATGLALSLLLLAAVASSYAAAQRKLLT 180
140 • 160 • 180



181 PVTVLTAVVTFFAICLTWRIEDPPFNSLLFALLAAAGGLQGIYVLVMLVLLILAYRRRWR 240

200 • 220 • 240 



241 RLTVCGGIMFLACVLVLIVDAVLQLSPLLGAVTVVSMTLLLLAFVLWLSSPGGLGTLGAA 300 

260 ■ 280 • 300 



301 LLTLAAALALLASLILGTLNLTTMFLLMLLWTLVVLLICSSCSSCPLSKILLARLFLYAL 360 

320 • 340 • 360 



361 ALLLLASALIAGGSILQTNFKSLSSTEFIPNLFCMLLLIVAGILFILAILTEWGSGNRTY 420

380 • 400 • 420 



421 GPVFMCLGGLLTMVAGAVWLTVMSNTLLSAWILTAGFLIFLIGFALFGVIRCCRYCCYYC 480

440 • 460 • 480 



481 LTLESEERPPTPYRNTV* 498



% Identity = 1.0 (5/498) % Homology = 0.0 (0/498) % Total = 1.0 (5/498)



/// 



### DNA Strider 1.4f6 ### Friday, July 13, 2007 4:47:42 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 6.xprt x Laux et al. EBV terminal gene.xprt => Protein Alignment 

Protein sequence 30 aa STAVAQSATPSV . . . LRAATSGATAAA 

Protein sequence 498 aa MGSLEMVPMGAG . . . ERPPTPYRNTV* 

Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length: 6 aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 6. Comparison of the amino acid sequence
of SEQ ID NO: 6 with the amino acid sequence encoded
by the nucleotide sequence of Fig. 2 of Laux et al.

1 MGSLEMVPMGAGPPSPGGDPDGYDGGNNSQYPSASGSSGNTPTPPNDEERESNEEPPPPY 60 

20 ■ , 40 • 60 

61 EDPYWGNGDRHSDYQPLGTQDQSLYLGLQHDGNDGLPPPPYSPRDDSSQHIYEEAGRGSM 120

80 • 100 • 120

121 NPVCLPVIVAPYLFWLAAIAASCFTASVSTVVTATGLALSLLLLAAVASSYAAAQRKLLT 180

140 • 160 • . 180 

181 PVTVLTAVVTFFAICLTWRIEDPPFNSLLFALLAAAGGLQGIYVLVMLVLLILAYRRRWR 240

200 • 220 ■ 240 

241 RLTVCGGIMFLACVLVLIVDAVLQLSPLLGAVTVVSMTLLLLAFVLWLSSPGGLGTLGAA 300

260 • 280 • 300 

301 LLTLAAALALLASLILGTLNLTTMFLLMLLWTLVVLLICSSCSSCPLSKILLARLFLYAL 360 

320 ■ 340 - 360 

361 ALLLLASALIAGGSILQTNFKSLSSTEFIPNLFCMLLLIVAGILFILAILTEWGSGNRTY 420

380 • 400 • 420 

1 STAVAQSATPSV 12

+ 

421 GPVFMCLGGLLTMVAGAVWLTVMSNTLLSAWILTAGFLIFLIGFALFGVIRCCRYCCYYC 480

440 ■ 460 • 480 

20 

13 SSS ISSLRAATSGATAAA 30 
+ S R T 

481 LTLESEERPPTPYRNTV* 498 
% Identity = 0.6 (3/498) % Homology = 0.4 (2/498) % Total = 1.0 (5/498) 

/// 



### DNA Strider 1.4f6 ### Friday, July 13, 2007 4:50:21 PM (US Letter @ 100%)



9310-13DVCTDV SEQ ID NO 1.xdna x Bankier et al. EcoRI Dhet fragment.xdna => DNA Parallel

DNA sequence 538 bp catgatggcacg . . . aaacagtagccc linear 

DNA sequence 12436 bp gaattctcaaag . . . tgtttagaattc linear 

Alignment 7. Comparison of nucleotide sequence
of SEQ ID NO: 1 with the nucleotide sequence of 
Fig. 2 of Bankier et al. 



1 gaattctcaaaggcggcaccctcgccggcgcgcctgtcctcccagggacccgagacgaag 60 

2 0 • 4 0 - 60 



61 gcccgtctgtagaggaagtggttgcgcatgcgggccagctcccagtagaccacgtccccc 12 0 

80 • 100 • 120 



Method: Blocks (Martinez)
Layout: Standard
Mismatch penalty: Smaller (1)
Gap penalty: Medium (2)
Translation: Off



121 cagacgcgcaggcacagggtctcggtcagggtctcgctctgttgcgccaggcaggactgc 180 

140 • 160 ^ • 180 



181 agcttggccagaccctcggtggccacctggcgcaggtactgctccttgcgcttgagcgcg 240

200 * 220 : 240 



2 41 tccgagagggcgccggacgggccgggctctcgtgccccagccggccggggcacctccggg 3 00 

260 • 280 *" ■ 300 



301 ctctcccgggacgcctcctcctcgcctcggcccaaccgctgcatggctcggttgagccgc 3 60 

320 • 340 • 360 



361 gtgtacagctcgttcctcttttgcaggatggcccggtactgggggtgcgccgtgaaggcg 42 0 
• 380 400 • 420 



421 gcggcgcagtccgccttcagcgcctccaccgcgtcgcccgaggagctgtagaccccgccg 480 

440 ■ 460 ■ 480 



481 cagaagagccgctccgtggccccgggagccacggcgtcaaacaggtgagtcagccttgcc 54 0 

500 • 520 " • 540 

1 catg 4 

i 

541 cccgccagcgcctcctcgcaggccccccgcaccagggccaggcgacgctcccgggcaaac 600 

560 • 580 ■ ^ 600 




20 

5 atggcacgccggctgcccaag-c-ccaccctcc 35 

Mill III I I I II I III III 

601 agggcagagaggcgggaatggccgccaccctccccctgccccgttgcaccgatagcatgg 6 60 

620 • 640 • 660 



661 ccgccagagttccaatagaggagctccgagagctccgccacctccgggggcactgtcgag 720 

680 ■ 700 • 720 



7 21 aagacgttgtaggtgtccagcgctctggtcgccccctctgcctccggccgccccgggccc 7 80 

740 ■ 760 • 780 



781 gggaccgcgccctcctctgggccgcccggcctcgccttctcctcagcctccaacaggtgc 840 

800 • 820 • 840 



841 ccgagccccgcctggcggacttcattctcaaacagtcccgagaccggctccggattcacc 900 

860 • 880 • 900 



901 ggcaccgccaggtggttacaggagacgtgggtcccctctgccgtggaagggttgccgtgg 960 

920 ■ 940 960 



961 ttgggcagaaccatcagctcgcccacacagcgccagcagggcacagaggtgatgtagagg 102 0 

980 1000 • 1020 



1021 cgcgggtctgggatgggacttacgccccgaaagcggcccagcagatccagggcccgttcc 1080
1040 • 1060 • 1080



1081 aggctctccagccccatggtgtgagacatgcaataaaacacgctattgattctcttcatt 1140

1100 " • 1120 ~ • 1140 



1141 aaaatctctatgtcatttattaggcacaaacttacatcgactttatgccccccgtaaaac 12 00 

1160 • 1180 ■ 1200 

36 a 36 

I 

12 01 tccacagagtacgcgactgagggggtacggagaggcgggacccgggtaccctttctacca 12 6 0 

1220 ■ 1240 - 1260 

40 

37 ggggag 42 

1 1 1 1 i 

1261 ggggcgagcagcgcggcagaggcctctctcgagttctctagcaggtgcaccagctccagg 1320

1280 • 1300 • 1320 



1321 gacagggcgctgcatgcacggtcattctgccgtctcaaacggggaaggaggatggcctcc 1380 

1340 • 1360 ' • ~ 1380 






13 81 agctcggccagcaggccggcgttgcgcaccaccgcagccacgtccagactccgggggtcc 14 4 0 

1400 • 1420 • ' 1440 



1441 agccgggtgcacacgctcagctcaaccgccagggcgtacacctggctgtacgccgccgcc 150 0 

1460 • 1480 • 1500 



1501 agcagccccgacatcgccgccccaggggtctctagacctcgagtccggggagaacggtgg 1560 

1520 ■ 1540 • 1560 



1561 ccagacggcgcttgcgtctgcccccggagccctgccctcctccacccagcagcagcccgg 162 0 

1580 • 1600 • 1620 



1621 ccgaggcctgcgacgcggtgctgaccggctcggccacgctgataaagttgtcctgggctg 16 80 

1640 • 1660 • 1680 



1681 ccccgggcccaccccacactccctccagaaagtcccgagcggcctccgccgtccactcta 174 0 

1700 • 1720 ■ 1740 

43 gctggaggc 51 

Ml 1 1 MM 

1741 tcccgctggaggcaatggtcgccagggtttctaggacgctgtccgccaggacggagaagc 1800 

1760 • 1780 • 1800 



1801 ggcccaataagtactccgcgtcgtccctagtcagcgaggcgcatgcctcgcccatggcat 1860 

1820 ■ 1840 • 1860 



1861 ccacaaggttgcacaccacatcaaacacacagtcttcctcctgtttttgtgatataatgg 192 0 

1880 ■ 1900 • 1920 



1921 cctccaggccagccctgatgttctcaatctcatatgtggtcgcggcttgggtccggcgct 19 80 
• " 1940 • 1960 ■ 1980 



19 81 tcacggtcaaccctagggtgggggtggcaaagacaaacttcttccgcatggaagagcccc 2 040 

2000 - 2020 • 2040 



2 04 1 cggcctgcttgcgcagcccagccccgggggcctgcagcaggttcctgtccacgccccggc 2100 

2060 • 2080 2100 



2101 ccataaagtatcccaggttcccggcctggaatatctggttgttgccgttgacccccgtgt 2160 

2120 • 2140 • 2160 




6 0 

52 ggattttc-cagac 64 

1 1 i nil 

2 161 acttgttgatggtcactggcagcgtgacaaccggacgggccttgcagacctggctaagac 22 2 0 

2180 • 2200 " - 2220 

80 ■ 100 • 120 

6 5 agtcccctgcttcctaaatttcaagagctgaaccagaataatctccccaatgatgttttt 12 4 

1 1 1 1 I I II I 1 1 11 I I Ml I I I 

2 221 agtc=============tgtggccgcgcag=accaccgt==ggt=cgcagt=aagggagg 2 2 62 

2240 • 2260 

140 • 160 • 180 

12 5 cgggaggctcaaagaagttacctggtatttctgacatcccagttctgctacgaagagtac 18 4 

II I III ill II I I I III Ml III I Mill 

2 2 63 aggtggcctccgcgtag==gcc==g==ctgccgac=tccaccgcccgc=gtgcccagtac 2 314 

2280 • 2300 

200 

185 gtgcagaggacttttggggtg cctcggc gccaacg 219 

III I.I II Mil lllllll II 

2 315 gtgggg=gtagtcacgggcgggcaccgactgcgtcctcggcaccagtccctgaatcaggc 2 37 3 
2320 • 2340 • 2360 

220 - 240 ■ 260 

22 0 cgccatagacaagaggcagagagccagtgtggctggggctggtgctcatgcacaccttgg 279 

I MM I I I I Mil I I I II I I I I Mill 

2 37 4 tgatgtagaactgggtctggccgcacgccttcaggatggcgttgttgagcctctgcttgg 2 4 33 
2380 • 2400 ■ 2420 

280 - 

280 cg ggtcatccgcc 292

M I Ml 

2 4 34 cgtaagtgaccaggttgccaggcaccacatctatgacgttgctctcttcgtgggcccggg 2 4 93 
2440 • 2460 • 2480^ 

300 

293 a-cccccgtcca 303 

I Miiiiiiii 

2494 agcccccgtccacaaagagggccaggtcagagtactcctccgcgctggccccgctgggga 2553 
2500 • 2520 • 2540 



2 55 4 cagggaccgagcgccgcctggaaaagttgtgccacaggtacaggcttgagagcttagtgt 2 613 
2560 • 2580 • 2600 



2 614 ccgggaatagggtcttgtggtaggtgttgaggaatttcatgtagggcccgttgatgatgt 2 67 3 
2 620 • 2 64 0 - 2 6 60 • 



2 67 4 agttctccctcctggtagtggacttgatgaagctgttctggagggcggcattctcccccg 2 7 33 
2680 • 2700 • 2720 



2 734 tgaagaccaccctgttcttgatcttgatgttcctggggcacagcatcagcaccttggaca 2 7 93 
2740 • 2760 ■ 2780 



2 794 tgcgcacaggcagccgccggccgtacacccggccctgcagggccgcgtccaggtctggca 2 85 3 
2800 • 2820 • 2840 



2 85 4 ggtcgcaggtgggctccccatgcaccaccttggcctccttggccgtgaggacccccttgt 2 913 
2860 ■ 2880 • 2900 • 




304 gcaggctc 311 

1 1 1 1 1 ii 

2 914 cgatggccaggctcctaaagttggtgcacagcgtctggtagtgaccctttagccactctg 2 973 
2920 ■ • 2940 • 2960 



2 974 gggggctctggccaagcccggggttgtcattctcatagcacatacagatgggcagggaga 3 033 
2980 • 3000 • 3020 



3 03 4 tgtcctgcaggatggtcagcagtgagcggtaaaacagctgggtgaagatggggcaggcgg 30 9 3 
3040 ■ 3060 • 3080 



3 0 94 gctgcgcaaaggggttgcacgagtactgcatcacgtggtagcagctcttgaccaggtcct 315 3 
3100 • 3120 • 3140 " • 

312 aggccg 317 

III 

3154 tgtaggtgatgttgttcttggccatgctgttcataaactggaccacttcggcgtccaccg 3213 
3160 • 3180 ■ 3200 

320 

318 ccgcatcc 325 

MINIM 

3 214 ccgcatccacgtccttgaacatcttgacaaagtcacgcgggccatggggctccttctcta 32 7 3 
3220 • 3240 • 3260 



32 7 4 gctttccttcagcgtctatgcccagccgagacagccgctccagcaggttctggttcagct 33 33 
3280 • 3300 ■ 3320 



3 334 gccagtaggtgtagcggggctcgtcgtccggccgctgcccgtcgtcctccttatcgatga 3 39 3 
3340 ^ . 33 6 o . 3380 



33 9 4 agttgagaaagttgcccaaaaagtccgtctcgttgtaggagcccgaggcccccgagatca 34 53 

3400 • 3420 • 3440 • 

326 gctgg 330 

MM* ' 

34 5 4 cataggggtccctccgctgcgtggacatgacgggggggaagcggtccctcagcctaaaga 3513 

3460 • 3480 • 3500 



3514 agagcgtgttcaggcacacggccggggcccggccctcgcagagcgagcacatgggactgg 35 7 3 
3520 . • 3540 « 3560 



3574 cggccgcccccgccacgtagctgcccgtctccggcaccggggtcagagagctcttctgtc 3633
3580 • 3600 • 3620 



363 4 cctggcaaaactgcaggtagtaggcatagcgggcaagaaggttgggcgagaaggaggccg 3693 
3640 •■ 3660 ■ 3680 
3694 catagaccaggtgctccacagcgtagtttcccggaccgttggttccggtcacgtctggcc 3753
3700 • 3720 • 3740 



3 75 4 caccccagcccgagaagcagggtcggcggcaggggtcccaggtcccctcctgcagggtcc 3 813 
3760 • 3780 • 3800 



3814 ccaggccgtgggtcatgtagaaactgttaaagagactctccttgccctgaccggttgact 387 3 
3820 • 3840 • 3860 



3 87 4 tcgagacccccgagacgtagaggacggaattggtggcaaagatctgcgtggacacgtggg 393 3 
3880 3900 ■ 3920 



3 93 4 gggccaggctggcattatatcggtgtaacgcagccacacgggcctctggaccctcacagt 3993 
3940 ■ 3960 • 3980 



3994 cggcaaacaggggccacgagtcgtagttgaggctggccggggtctcgtgcgaggcctcca 4 05 3 
4000 ■ 4020 • 4040 



4 05 4 gcatggcgggtgcgtagctcaccgccagctcgcatgccgcgctgtccacaatcattaagg 4113 
4060 ~ . 4080 . 4100 



4114 ctcccgagtccgggtgactgatggttgaggctgggaactccttgaggggggccaccttgg 417 3 
4120 . 4140 ■ 4160 ~ " • 



417 4 ccaccttggcctggtcctgcaggctctgcttctccagcagctccaccagcttgcccaccc 4233 
4 18 0 • 4 2 0 0 • ' 4 2 2 0 



4 2 34 gtcggacgcgcagcgcctgcgccagcccggtgtacagcgcctcgtgcatgcagcggctga 4 2 93 
4240 • 4260 • 4280 ~ • 



42 94 ggtccgagttgtaaaactggcggagctggggcacgccctctgggaacacctccttgtcgt 4 353 
4300 • 4320 • 4340 



4 354 agagcgggaccctaacgctcgcagactgccccaccgctacctcctgttttaacgatggaa 4413 
4360 : ~ 4380 • 4400 



4414 tggccaccaggtttccgctgtagagtcgctccttgaaggcctcggttattgccaccgccc 4473
4420 • 4440 • 4460
4474 ccaggtaggcagagggatctagcccttcggggaagaagtcccccggcttggagctttccc 4533
4480 • 4500 • 4520 



4534 tcggtagggcgctgtaggcgtcgtacccaaacacctccctggtctcgccacagagggcct 45 93 
4540 • 4560 • 4580 



4594 cgagacccggcccctcaaagatggggggaaccatatgggcattgtggaacacgtagatgt 4 653 
4600 • 4620 ■ 4640 



4 654 ccctgtgataggaggtagcgcgtaggagcccgcagttggggtcgggcctcctgtgcagag 4 713 
4660 • 4680 ■ 4700 " - 



4714 ccttgacattgatgctgaagcccggctccacggtgatgccgcaaaggagcggcaccgtca 47 7 3 
4720 ■ 4740 • 4760 



4774 ggcacctgtggcccgcgtagccggtccccagtgtggccacctccctaagagggtaggtgg 4 8 33 
4780 • 4800 • 4820 



4 8 34 ccagggggtaaaagtagatgtagccgcacggacccggctggctctggctgcccagattat 48 93 
4840 • 4860 • 4880 



4 8 94 cctcgctagtctgtgcaccctgcatgatgcccaaggtatcgccccggcctcccagtccca 4 953 
4900 • 4920 • 4940 



4954 cattaaatgttacactttactcatcacgcaacacccactgtttattcatttacaaagatt 5013
4960 • 4980 -5000 



5014 tcaggaagtcagtcaggctggccagggcccacgtcacggggaactgacgtctcagcgatc 5073
5020 • 5040 • 5060 



5074 ttggcatgccgcccagcctcgcaaaccagagtctgcgatagagggccaggtagtgggcga 5133 
5080 • 5100 • " 5120 " " • 



5134 ttgcccccagcacgaaggcggcgctcttgtggtcatccaggtagtttcgcaccgcaaaca 5193 
5140 • 5160 • ~" 5130 



5194 ccactgtgtagcacagcaccaccctgagccgcgaccagtagtcgtagtggtcgttgtaca 5253
5200 • 5220 • 5240
5254 ctgcgcgcaggacgctgatgatgagccgtacgtgcgtgtctttgcccccgatgtcggctg 5313
5260 • 5280 • 5300 



5314 tcctgcaggccagctccgcgtacagcttcctatccttcctcagggaggccttgatgagcc 537 3 
5320 • 5340 • 5360 



53 7 4 ggcagaggaccagggctggcaaaggcaggtctttctcatcccgggtgaacaccgcgtaca 54 33 
5380 • 5400 ■ 5420 



5434 tggccctgaacatgaggtagctggactcagccaccttgtcgtccggcggcgagggcgcga 54 93 
5440 • 5460 • 5480 ~ • 

331 gaccgggg : 338 

II MINI 

54 9 4 cccacgcctcgaccggggtcctcacaaacacagaatctgtagacttggctggcctcatgg 55 5 3 
5500 • 5520 • 5540 



5554 tctcgtcaggccagctcacgggcttcaggcttatatgataaaatgggcgtggcagaatag 5 613 
5560 • 5580 ■ 5600 • 



5 614 tataagacgcgaggcctgggtgaggagagtccagagcaatggccaggttcatcgctcagc 5673 
5620 • 5640 • 5660 



5674 tcctcctgttggcctcctgtgtggccgccggccaggctgtcaccgctttcttgggtgagc 5733
5680 • 5700 • 5720 • 



57 3 4 gagtcaccctgacctcctactggaggagggtgagcctcggtccagagattgaggtcagct 5 7 93 
5740 • 5760 ■ 5780 



5 7 94 ggtttaaactgggcccaggagaggagcaggtgcttattgggcgcatgcaccacgatgtca 5 85 3 
5800 • 5820 • 5840 



5 854 tctttatagagtggcctttcaggggcttctttgatatccacagaagtgccaacaccttct 5 913 
5860 • 5880 ■ 5900 



5914 ttttagtagtcaccgctgccaacatctcccatgacggcaactacctgtgccgcatgaaac 5973 
5920 " - 5940 • 5960 



5974 tgggcgagaccgaggtcaccaagcaggaacacctgagcgtggtgaagcctctaacgctgt 6033
5980 ■ 6000 • 6020 
6034 ctgtccactccgaaaggtctcagttcccagacttctctgtccttactgtgacatgcaccg 6093
6040 • 6060 ■ 6080 " " ■ 



6 094 tgaatgcatttccccatccccacgtccagtggctcatgcccgagggcgtggagcccgcac 6153 
6100 • 6120 • 6140 



6154 caactgcggcaaatggcggtgttatgaaggaaaaggatgggagcctctctgttgctgttg 6213 
6160 • 6180 • 6200 



6214 acctgtcacttcccaagccctggcacctgccagtgacctgcgttgggaaaaatgacaagg 62 7 3 
6220 ' • 6240 • 6260 



6274 aggaagcccacggggtttatgtttctggatacttgtcgcaataaacgcacttgcctattt 63 33 
6280 ^ • 6300 • 6320 



6 3 34 caccttgttttagtgtggcattgggggggtggcattgcgggtggatagcctcgcgactcg 63 93 
6340 • 6360 • 6380 



6 3 94 tgggaaaatgggcggaagggcaccgtgggaaaatagttccaggtgacagcagcagtgtgt 6453 
6400 • 6420 . ■ 6440 



6 454 gaagattgtcacagctgctggtttggagaaaacgggggtgggcggtgatcagggagaaca 6513 
6460 - 6480 ■ 6500 



6 514 attccccggggacacctgcacgagacccctgggctctcaggaactccgcccaggtcttgc 65 7 3 
6520 • 6540 • 6560 



65 7 4 caattggggtgatcctgtagcgccgcggtttcagcatcacaggttattttgcctgaagct 6 6 33 
6580 " : 6600 • 6620 



6 6 34 tgctggggcgtaaatccctctcgccttgtttctcagagagcatttcaggccggttttgca 6 693 
6640 ■ 6660 • 6680 



6 6 94 gtcgctgctgcagctatggggtccctagaaatggtgccaatgggcgcgggtccccctagc 6 753 
6700 • 6720 • 6740 



6 754 cccggcggggatccggatgggtacgatggcggaaacaactcccaatatccatctgcttct 6813 
6760 • 6780 • 6800 
6814 ggctcttctgggaacacccccaccccaccgaacgatgaggaacgtgaatctaatgaagag 6873
6820 • 6840 • 6860 



6874 cccccaccgccttatgaggacccatattggggcaatggcgaccgtcactcggactatcaa 69 3 3 
6880 • 6900 • 6920 



6 934 ccactaggaacccaagatcaaagtctgtacttgggattgcaacacgacgggaatgacggg 6 9 93 
6940 . • 6960 " ■ " 6980 



6 9 94 ctccctccccctccctactctccacgggatgactcatctcaacacatatacgaagaagcg 7 053 
7000 • 7020 ■ 7040 



7 05 4 ggcagaggaaggtaagagtgccatctatctgtacttttatttattgcatcacaagtcaca 7113 
7060 ■ 7080 • 7100 



7114 tcaataataagggcgccatctagcgggagatgttatccacaccatcccaattcacatctc 7173 
7120 • 7140 • 7160 



7174 agggacaacaggtcaaagttctttgttgacacccccagcgctggctccagggggtggaag 72 3 3 
7180 ^ • 7200 « " 7220 



72 34 cgttggatgcagtcctccgcatcggggcggacgcctcctcccaacgcgtttctgcggatc 7 2 93 
7240 • 7260 • 7280 



7 2 94 agtcgctggctggtgggcatcggagtcggtgggcggtcctccacggggacacgctccttc 7 35 3 
7300 • 7320 • 7340 



7 35 4 ttggccttgttctttgaccttttggacattcttctgaaggaacggcggagagtagcgtag 7 413 
7360 • 7380 • 7400 



7414 aatccagccagtggtctacccggtcgcatggtggcttcttagatgaggagcaggcataaa 7 4 73 
7420 * 7440 • 7460 



7 4 74 agtccaaacaggacacagagtaccaccaggagtagtcttagtctgctgacgtctgggtcc 75 33 
7480 • 7500 • 7520 



7534 tcggggcaggggtggctaggcctggtctccgtagaagagccgggcaggccgcaggcagag 7593
7540 • 7560 • 7580
7594 gactgctgctctagcaaagcacgctccaggacgtgtaccatctcgagagtgaggcacagc 7653
7600 " • 7620 ■ 7640 



7 654 tgttttcgtggacttttatacagtaaggacaaggaaagaaggccagaggaatgtggaaag 7713 
7660 • 7680 ■ 7700 



7 7 14 atgagcgaggacaggtgtggaggttttgggctagctcttagtttctgggtgtgagagagg 7773 
7720 • 7740 • 7760 



77 74 gattaaagtgcttatgcgcaaagaatgtgtcaacaacaggtgttcctgcctctgctggca 7833 
7780 • 7800 ■ 7820 



7 8 34 tgagttaggtgtggcttgggctgaatccaaatgtgtattggcacaagatggaaagcaaag 7893 
7840 • 7860 • 7880 



7 894 ttgctggagttactgggtgggagacagggatgtatgtggtcccccgctggtatgccagta 7 953 
7900 • 7920 • 7940 

340 

339 cct 341 

I I I 

7 954 ccctgtggaagtaaggggcctcatctgcctggtagttgtgttgtgcagaggtctgatgtg 8013 
7960 • 7980 ■ 8000 



8014 tgtaggaggggtgggttcaacgcaggggcgttggtggcggagtctggcaacgcccgggtc 8 07 3 
8020 • 8040 • 8060 



8074 cttgctacctgtgtggtgtgttaagggctgggtaaaggtgtctgccaattctcgcatgtc 8133 
8080 ■ 8100 • 8120 * 



8134 ctcctttccccttgttttgaaatagaatatgaatgtggcttttcagcctagacagacagt 819 3 
8140 " • 8160 • 8180 



8194 gtggctaagggagtgtgtgccagttaaggtgattagctaaggcattcccagtaaatggag 8253 
8200 • 8220 • 8240 



825 4 ggagagtcagtcaggcaagcctatgacatggtaatgcctagaagtaaagaaaggttagtc 8313 
8260 • 8280 • 8300 

360 

342 tggcatcatcagcgccgtcca-cggccg 368 

MINIM MM III 

8314 atagtagcttagctgaactgggccgtgggggtcgtcatcatc=======tccaccggaac 8 36 6 

8320 • 8340 • 8360 
380 • 400 • 420

36 9 tagcccagtccgcgaccccctctgtttcttcatctattagcagcctccgggccgcgactt 42 8 

II III I I I I I I II I I III II Ml! I 

83 6 7 cagaagaacccaaaagcagcgtaggaaggt=gtggatca=ccgccgccatggc=cggaat 84 2 3 

8380 • 8400 • 8420 
440 ■ 460 
429 cgggggcgactgccgccgcctccgcc-gccgcagccgtc 466 

i ill iiiiiiiiiiii mi ii 

84 2 4 c=atgactatgaccgccgcctccgtctgtcatcaaaggcgggccctggtcacctcctttg 8 4 82 

8440 • 8460 • 8480 



8 4 83 ttttcaacctcttccgtcaattgtggagggcctccatcatttccagcagagtcgctaggg 85 42 

8500 • 8520 • 8540 



85 4 3 ctatgaggcagcgggtcatgtgggccattgtcatcagtgttgtcagggtcctgtgggcca 8 6 02 

8560 • 8580 • 8600 



8 6 03 ttgtcatcagtgttgtcagggtcctgaggcagcgggtcatgtgggccattgtcatcagtg 8 6 62 

8620 • 8640 • 8660 



8 6 63 ttgtcagggtcctgtgggccattgtcatcagtgttgtcagggtcctgtgggccattgtca 8722 

8680 • 8700 • 8720 



8 72 3 ggaccacctccaggtgcgcctaggttttgagagcagagtgggggtccgtcgccggctcca 87 82 

8740 ■ 8760 • 8780 



8 7 83 ctcacgagcaggtggtgtctgccctcgttggagttagagtcagattcatggccagaatca 8842 

8800 • 8820 ~ • 8840 



8 84 3 tcggtagcttgttgagggtgcgggagggagtcatcgtggtggtgttcatcactgtgtcgt 8 9 02 

8860 • 8880 ■ 8900 



8 9 03 tgtccatggtaatacatccagattaaaatcgccagaaacaggaggagccaaaggagatca 8 9 62 

8920 • 8940 ■ 8960 



8 9 63 accaatagagtccaccagttttgttgtagatagagagcaataatgagcaggatgaggtct 9022 

8980 • 9000 • ~ 9020 



9 02 3 aggaagaaggctaggaagaaggccaaaagctgccagatggtggcaccaagtcgccagagc 9082 

9040 • 9060 • 9080 

467 gatacc 472 

mill 

9 0 83 atctccaataagtagatccagatacctaagactgcgttgaaaaaagagtgttagggttgg 9142 

9100 ■ 9120 • 9140 
9143 aaaagtgggggtgtggtaaataattcctagggaatgttagatcttaccaagtaagcaccc 9202

9160 • 9180 • " 9200 



92 0 3 gaagatgaacagcacaattccaaggaacaatgcctgtccgtgcaaattccagagagcgat 92 62 

9220 • 9240 • 9260 



92 6 3 gagcaggagggtgactggggaaagaggagaaagtgcgttagagaaggaagagtaagggaa 932 2 

9280 • 9300 • 9320 



932 3 agggggtgtggggcaaagggtgtaatacttactcatcagtaggagtatacaaagggctcc 93 82 

9340 • 9360 • 9380 



9383 aagtggacagagaaggtctcttctgaagataaagatgatcaaaattataattataagcat 9 44 2 

9400 • 9420 • 9440 

480 • 500 • 520 

473 gggt-caggtggcgggggacaaccccacgaca-ccgccccacgcggggca 520 

I I III II II I I Ml II I II II I I 

944 3 gagagcaaaggaatagaggacaaggagggctcctccagtccagtcactcataacgatgta 95 0 2 

9460 • 9480 • 9500 

521 cgtaagaaacagtagccc 538 

I I i mi 1 1 mi™ 

9503 cagccaaaacagtagcgccaagaggaggagaaggagagcaaggcctagggaagaggagag 9562 

9520 ■ 9540 ' • 9560 



95 6 3 ggggggtcctcgagggggccgtcgcgggcccggtgggcccctctcaaggtcgtgttccat 962 2 

9580 • 9600 9620 



9 62 3 cctcagggcagtgtgtcaggagcaaggcagttgaggaaagaagggggcagagcagtgtga 9 682 

9640 • 9660 • 9680 



968 3 gaggcttatgtagggcggctacgtcagagtaacgcgtgtttcttgggatgtaggcccggg 9742 

9700 ■ 9720 • " 9740 



974 3 gggatttgcggggtctgccggaggcagtacgggtacagatttcccgaaagcggcggtgtg 9802 
• ^ 9760 • 9780 • 9800 



9 8 03 tgtgtgcatgtaagcgtagaaaggggaagtagaaagcgtgtgtttgtgttagaaaagcgg 9 8 62 
• 9820 • 9840 • 9860 



9 8 63 gtccccggggggcaagctgtgggaatgcggtggccaagtgcaacaggaaatggaaaggca 992 2 

9880 • 9900 • 9920 
9923 gtgcggcaatcagaagggggagtgcgtagtgttgtgggaagcggcagtgtaatctgcaca 9982

9940 • 9960 • 9980 



9 9 83 aagaggcgcggggcgcgcaacgttgggaggtcgttggcggcaggcgggaggccgtgcttt 1004 2 

10000 • 10020 • 10040 



10043 aggggggttcaggtgaggcaaggctgtggggtaaccgtaggggaggcgggtgaggcggct 10102 

10060 • 10080 • " 10100 



10103 aagagggctaagggtcggcgggtgacgaagcagcagacggcggatatgggaatttcagaa 1016 2 

10120 • 10140 10160 



10163 tgaggtggcggattcaggcgaaaagggtgtgggctgtgcgagtgtcatgaggcaggcgcg 10222 
• ~~ 10180 • 10200 ■ 10220 



102 2 3 gaaagtcgctgcggcttgctggggcatggggggccgcgcattcctggaaaaagtggaggg 10282 

10240 ■ 10260 ■ 10280 



10283 ggcgtggccttcccccgcggccccccagcccccccgcacagagcggcgctacggcgggcg 10342 

10300 • 10320 ■ 10340 



1034 3 ggcggcggggggtcggggtccgcgggctccgggggctgcgggcggtggatggcggcggac 104 02 
• ~ 10360 • 10380 ■ 10400 



104 03 gttccggggatcgggggggtcggggggcgccgcgcgggcgcagccatgcgtgaccgtgat 104 62 

10420 • 10440 • 10460 



104 63 gagggggcagggtcgcagggggtgtgtctggtgggggcgggagcggggggcggcgcggga 10522 

10480 • 10500 • 10520 



1052 3 gcctgcacgccgttggagggtagaatgacagggggcggggacagagaggcggtcgcgccc 10582 

10540 • 10560 • 10580 



10583 ccggccgcgccagccaagcccccaaggggggcggggagcgggcaatggagcgtgacgaag 10 642 

10600 • 10620 ■ 10640 



1064 3 ggccccagggctgaccccggcaaacgtgacccggggctccggggtgacccagccaagcgt 107 02 

10660 " ■ 10680 • 10700 
10703 gaccaaggggcccgtgggtgacacaggcaaccctgacaaaggccccccaggaaagacccc 10762

10720 ■ 10740 • 10760 



10763 cggggggcatcggggggtggggcatggggggccgcgcattcctggaaaaagtggaggggg 10 822 

10780 • 10800 • 10820 



10 82 3 cgtggccttcccccgcggccccccagcccccccgcacagagcggcgctacggcgggcggg 10882 

10840 • 10860 • 10880 



10 883 cggcggggggtcggggtccgcgggctccgggggctgcgggcggtggatggcggcggacgt 1094 2 

10900 • 10920 • 10940 



109 4 3 tccggggatcgggggggtcggggggcgccgcgcgggcgcagccatgcgtgaccgtgatga 11002 

10960 • 10980 • 11000 



11003 gggggcagggtcgcagggggtgtgtctggtgggggcgggagcggggggcggcgcgggagc 11062 

11020 • 11040 • 11060 



110 6 3 ctgcacgccgttggagggtagaatgacagggggcggggacagagaggcggtcgcgccccc 1112 2 
• " 11080 ' - 11100 • 11120 



1112 3 ggccgcgccagccaagcccccaaggggggcggggagcgggcaatggagcgtgacgaaggg 11182 

11140 • 11160 ■ - 11180 



11183 ccccagggctgaccccggcaaacgtgacccggggctccggggtgacccagccaagcgtga 11242 

11200 • 11220 • 11240 



112 4 3 ccaaggggcccgtgggtgacacaggcaaccctgacaaaggccccccaggaaagacccccg 113 02 

11260 • 11280 • 11300 



113 03 tggggcatggggggccgcgcattcctggaaaaagtggagggggcgtggccttcccccgcg 11362 

11320 • 11340 • 11360 



113 63 gccccccagcccccccgcacagagcggcgctacggcgggcgggcggcggggggtcggggt 114 2 2 

11380 • 11400 ^ • 11420 



114 2 3 ccgcgggctccgggggctgcgggcggtggatggcggcggacgttccggggatcggggggg 11482 

11440 • 11460 • 11480 






114 83 tcggggggcgccgcgcgggcgcagccatgcgtgaccgtgatgagggggcagggtcgcagg 115 4 2 

11500 • 11520 • 11540 



115 4 3 gggtgtgtctggtgggggcgggagcggggggcggcgcgggagcctgcacgccgttggagg 11602 

11560 • .11580 • 11600 



11603 gtagaatgacagggggcggggacagagaggcggtcgcgcccccggccgcgccagccaagc 116 6 2 

11620 • 11640 • " 11660 



11663 ccccaaggggggcggggagcgggcaatggagcgtgacgaagggccccagggctgaccccg 11722 

11680 11700 • 11720 



11723 gcaaacgtgacccggggctccggggtgacccagccaagcgtgaccaaggggcccgtgggt 11782 

11740 • 11760 - 11780 



11783 gacacaggcaaccctgacaaaggccccccaggaaagacccccggggggcatcggggggtg 11842 

11800 • 11820 ~ • 11840 



11843 gggcatggggggccgcgcattcctggaaaaagtggagggggcgtggccttcccccgcggc 11902 

11860 • 11880 • 11900 



119 03 cccccagcccccccgcacagagcggcgctacggcgggcgggcggcggggggtcggggtcc 11962 

11920 • 11940 " " • ~ 11960 



119 63 gcgggctccgggggctgcgggcggtggatggcggcggacgttccggggatcgggggggtc 12 02 2 

11980 • 12000 • 12020 



12 02 3 ggggggcgccgcgcgggcgcagccatgcgtgaccgtgatgagggggcagggtcgcagggg 12 0 82 

12040 ■ 12060 • " 12080 



12 0 83 gtgtgtctggtgggggcgggagcggggggcggcgcgggagcctgcacgccgttggagggt 12 142 

12100 • 12120 • 12140 



1214 3 agaatgacagggggcggggacagagaggcggtcgcgcccccggccgcgccagccaagccc 12 2 02 

12160 • 12180 • 12200 



12203 ccaaggggggcggggagcgggcaatggagcgtgacgaagggccccagggctgaccccggc 12 2 62 

12220 • 12240 - 12260 






12 2 63 aaacgtgacccggggctccggggtgacccagccaagcgtgaccaaggggcccgtgggtga 12 32 2 

12280 ■ 12300 • 12320 



12 323 cacaggcaaccctgacaaaggccccccaggaaagacccccggggggcatcggggggggtg 12 3 82 

12340 • 12360 ' ~ ' • 12380 



12383 ttggcgggggcatgggggggtcggatttcgcccttattgccctgtttagaattc 12 436 

12400 • 12420 

% Identity = 2.3 (291/12474) 



/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 4:53:21 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 1.xdna x Bankier et al. EcoRI Dhet fragment complement.xdna => DNA Parallel

DNA sequence 538 bp catgatggcacg . . . aaacagtagccc linear 

DNA sequence 12436 bp gaattctaaaca . . . ctttgagaattc linear 



Method: Blocks (Martinez) 

Layout : Standard 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Translation: Off 



Alignment 8. Comparison of nucleotide sequence
of SEQ ID NO: 1 with the complement of the 
nucleotide sequence of Fig. 2 of Bankier et al. 
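
The first and last twelve bases listed for the 12436 bp sequence in this alignment (gaattctaaaca . . . ctttgagaattc) correspond to the reverse complement of the last and first twelve bases listed for the same fragment in Alignment 7 (gaattctcaaag . . . tgtttagaattc). A minimal sketch of that check, assuming plain lowercase a/c/g/t input (the helper name `reverse_complement` is hypothetical and is not a DNA Strider function):

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.translate(_COMPLEMENT)[::-1]

# Ends of the Bankier et al. fragment as listed in Alignment 7.
print(reverse_complement("tgtttagaattc"))  # gaattctaaaca
print(reverse_complement("gaattctcaaag"))  # ctttgagaattc
```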



1 gaattctaaacagggcaataagggcgaaatccgacccccccatgcccccgccaacacccc 6 0 

20 -40 • 60 



6 1 ccccgatgccccccgggggtctttcctggggggcctttgtcagggttgcctgtgtcaccc 12 0 

80 • 100 ■ 120 



121 acgggccccttggtcacgcttggctgggtcaccccggagccccgggtcacgtttgccggg 18 0 
. 140 • 160 " " • " 180 



181 gtcagccctggggcccttcgtcacgctccattgcccgctccccgccccccttgggggctt 24 0 

200 • 220 ■ 240 



2 41 ggctggcgcggccgggggcgcgaccgcctctctgtccccgccccctgtcattctaccctc 3 00 

260 ■ 280 • 300 



301 caacggcgtgcaggctcccgcgccgccccccgctcccgcccccaccagacacaccccctg 360 
■ 320 " • 340 " • 360 



361 cgaccctgccccctcatcacggtcacgcatggctgcgcccgcgcggcgccccccgacccc 42 0 

380 • 400 • ~ 420 



421 cccgatccccggaacgtccgccgccatccaccgcccgcagcccccggagcccgcggaccc 48 0 

440 • 460 • 480 



4 81 cgaccccccgccgcccgcccgccgtagcgccgctctgtgcgggggggctggggggccgcg 54 0 

500 • 520 • 540 



541 ggggaaggccacgccccctccactttttccaggaatgcgcggccccccatgccccacccc 6 00 

560 ■ 580 • 600 






601 ccgatgccccccgggggtctttcctggggggcctttgtcagggttgcctgtgtcacccac 6 60 

620 640 • 660 



661 gggccccttggtcacgcttggctgggtcaccccggagccccgggtcacgtttgccggggt 72 0 

680 - 700 • 720 



721 cagccctggggcccttcgtcacgctccattgcccgctccccgccccccttgggggcttgg 78 0 

740 • 760 ■ 780 



781 ctggcgcggccgggggcgcgaccgcctctctgtccccgccccctgtcattctaccctcca 8 40 

800 • 820 • 840 



841 acggcgtgcaggctcccgcgccgccccccgctcccgcccccaccagacacaccccctgcg 9 00 

860 • 880 ■ 900 



901 accctgccccctcatcacggtcacgcatggctgcgcccgcgcggcgccccccgacccccc 9 60 

920 ■ 940 • 960 



9 61 cgatccccggaacgtccgccgccatccaccgcccgcagcccccggagcccgcggaccccg 102 0 

9 8 0 • 10 0 0 -10 2 0 



1021 accccccgccgcccgcccgccgtagcgccgctctgtgcgggggggctggggggccgcggg 108 0 

1040 ■ 1060 • 1080 



10 81 ggaaggccacgccccctccactttttccaggaatgcgcggccccccatgccccacggggg 114 0 

1100 • 1120 • 1140 



1141 tctttcctggggggcctttgtcagggttgcctgtgtcacccacgggccccttggtcacgc 12 00 

1160 • 1180 • 1200 



12 01 ttggctgggtcaccccggagccccgggtcacgtttgccggggtcagccctggggcccttc 12 60 

1220 • 1240 • 1260 



12 61 gtcacgctccattgcccgctccccgccccccttgggggcttggctggcgcggccgggggc 13 2 0 

1280 • 1300 " • " 1320 



1321 gcgaccgcctctctgtccccgccccctgtcattctaccctccaacggcgtgcaggctccc 1380
1340 • 1360 • 1380






1381 gcgccgccccccgctcccgcccccaccagacacaccccctgcgaccctgccccctcatca 14 4 0 

1400 • 1420 • 1440 



1441 cggtcacgcatggctgcgcccgcgcggcgccccccgacccccccgatccccggaacgtcc 1500 

1460 ■ 1480 • 1500 



15 01 gccgccatccaccgcccgcagcccccggagcccgcggaccccgaccccccgccgcccgcc 15 6 0 

1520 • 1540 • " 1560 



15 61 cgccgtagcgccgctctgtgcgggggggctggggggccgcgggggaaggccacgccccct 162 0 

1580 • 1600 ■ 1620 



1621 ccactttttccaggaatgcgcggccccccatgccccaccccccgatgccccccgggggtc 1680 

1640 • 1660 • 1680 



16 81 tttcctggggggcctttgtcagggttgcctgtgtcacccacgggccccttggtcacgctt 17 4 0 

1700 ^ • 1720 ^ • 1740 



17 41 ggctgggtcaccccggagccccgggtcacgtttgccggggtcagccctggggcccttcgt 1800 

1760 ~ ~ • 1780 " • " 1800 



1801 cacgctccattgcccgctccccgccccccttgggggcttggctggcgcggccgggggcgc 1860 

1820 . 1840 . i860 



1861 gaccgcctctctgtccccgccccctgtcattctaccctccaacggcgtgcaggctcccgc 192 0 

1880 • 1900 ■ 1920 



1921 gccgccccccgctcccgcccccaccagacacaccccctgcgaccctgccccctcatcacg 1980 

1940 ■ 1960 " • 1980 



1981 gtcacgcatggctgcgcccgcgcggcgccccccgacccccccgatccccggaacgtccgc 2 04 0 

2000 • 2020 • 2040 



2 041 cgccatccaccgcccgcagcccccggagcccgcggaccccgaccccccgccgcccgcccg 2 100 

2060 • 2080 • 2100 



2101 ccgtagcgccgctctgtgcgggggggctggggggccgcgggggaaggccacgccccctcc 2160 

2120 ■ 2140 • 2160 






2161 actttttccaggaatgcgcggccccccatgccccagcaagccgcagcgactttccgcgcc 22 2 0 

2180 • 2200 " • 2220 



22 21 tgcctcatgacactcgcacagcccacacccttttcgcctgaatccgccacctcattctga 2 2 80 

2240 • 2260 • 2280 



2 2 81 aattcccatatccgccgtctgctgcttcgtcacccgccgacccttagccctcttagccgc 2340 

2300 * 2320 • 2340 



2 341 ctcacccgcctcccctacggttaccccacagccttgcctcacctgaacccccctaaagca 2 4 00 

2360 ■ 2380 • 2400 



24 01 cggcctcccgcctgccgccaacgacctcccaacgttgcgcgccccgcgcctctttgtgca 2460 
: 2420 • 2440 • 2460 



24 61 gattacactgccgcttcccacaacactacgcactcccccttctgattgccgcactgcctt 2 52 0 

2480 2500 • 2520 



2521 tccatttcctgttgcacttggccaccgcattcccacagcttgccccccggggacccgctt 25 80 

2540 " • 2560 " • 2580 



2 581 ttctaacacaaacacacgctttctacttcccctttctacgcttacatgcacacacacacc 2 64 0 

2600 - 2620 • 2640 



2 641 gccgctttcgggaaatctgtacccgtactgcctccggcagaccccgcaaatccccccggg 2 7 00 

2660 • 2680 • 2700 



27 01 cctacatcccaagaaacacgcgttactctgacgtagccgccctacataagcctctcacac 2760 

2720 • 2740 • 2760 



2761 tgctctgcccccttctttcctcaactgccttgctcctgacacactgccctgaggatggaa 2 82 0 

2780 • 2800 • 2820 



2 821 cacgaccttgagaggggcccaccgggcccgcgacggccccctcgaggaccccccctctcc 2 880 

2840 • 2860 • 2880 



2881 tcttccctaggccttgctctccttctcctcctcttggcgctactgttttggctgtacatc 2940
2900 • 2920 • 2940






2 941 gttatgagtgactggactggaggagccctccttgtcctctattcctttgctctcatgctt 3 0 00 

2960 • 2980 • 3000 



3 0 01 ataattataattttgatcatctttatcttcagaagagaccttctctgtccacttggagcc 3060 

3020 • 3040 • 3060 



3061 ctttgtatactcctactgatgagtaagtattacaccctttgccccacaccccctttccct 312 0 

3080 • 3100 • 3120 



3121 tactcttccttctctaacgcactttctcctctttccccagtcaccctcctgctcatcgct 3180 

3140 • 3160 • 3180 



3181 ctctggaatttgcacggacaggcattgttccttggaattgtgctgttcatcttcgggtgc 32 40 
• ~ 3200 • 3220 • 3240 



32 41 ttacttggtaagatctaacattccctaggaattatttaccacacccccacttttccaacc 3 30 0 

3260 ■ 3280 • 3300 



3 3 01 ctaacactcttttttcaacgcagtcttaggtatctggatctacttattggagatgctctg 3 3 6 0 

3320 • 3340 " 3360 



3361 gcgacttggtgccaccatctggcagcttttggccttcttcctagccttcttcctagacct 342 0 

3380 " • 3400 • 3420 



3421 catcctgctcattattgctctctatctacaacaaaactggtggactctattggttgatct 3480

3440 • 3460 • 3480 



34 81 cctttggctcctcctgtttctggcgattttaatctggatgtattaccatggacaacgaca 35 4 0 

3500 • 3520 • 3540 



35 41 cagtgatgaacaccaccacgatgactccctcccgcaccctcaacaagctaccgatgattc 3 600 

3560 • 3580 • 3600 



3601 tggccatgaatctgactctaactccaacgagggcagacaccacctgctcgtgagtggagc 3 66 0 

3620 • 3640 • 3660 



36 61 cggcgacggacccccactctgctctcaaaacctaggcgcacctggaggtggtcctgacaa 3 72 0 
• 3680 ■ 3700 • 3720 






3 721 tggcccacaggaccctgacaacactgatgacaatggcccacaggaccctgacaacactga 37 80 

3740 • 3760 " 3780 



37 81 tgacaatggcccacatgacccgctgcctcaggaccctgacaacactgatgacaatggccc 38 4 0 

3800 • 3820 • 3840 



3 841 acaggaccctgacaacactgatgacaatggcccacatgacccgctgcctcatagccctag 3 9 00 

3860 • 3880 • " 3900 

1 catgatggcacgccggct 18 

xMIMil II I 

3 901 cgactctgctggaaatgatggaggccctccacaattgacggaagaggttgaaaacaaagg 3 9 60 

3920 • 3940 ~ 3960 



39 61 aggtgaccagggcccgcctttgatgacagacggaggcggcggtcatagtcatgattccgg 4 02 0 

3980 • 4000 • 4020 



4021 ccatggcggcggtgatccacaccttcctacgctgcttttgggttcttctggttccggtgg 40 80 

4040 • 4060 • 4080 



4081 agatgatgacgacccccacggcccagttcagctaagctactatgactaacctttctttac 4140 

4100 • 4120 « 4140 



4141 ttctaggcattaccatgtcataggcttgcctgactgactctccctccatttactgggaat 42 00 

4160 • 4180 ■ 4200 



4201 gccttagctaatcaccttaactggcacacactcccttagccacactgtctgtctaggctg 42 60 

4220 • 4240 • 4260 



4261 aaaagccacattcatattctatttcaaaacaaggggaaaggaggacatgcgagaattggc 432 0 

4280 • 4300 • 4320 



4 321 agacacctttacccagcccttaacacaccacacaggtagcaaggacccgggcgttgccag 4380 

4340 • 4360 • 4380 



4 381 actccgccaccaacgcccctgcgttgaacccacccctcctacacacatcagacctctgca 4 44 0 

4400 • 4420 • 4440 



4 441 caacacaactaccaggcagatgaggccccttacttccacagggtactggcataccagcgg 4 500 

4460 ■ 4480 • 4500 






4501 gggaccacatacatccctgtctcccacccagtaactccagcaactttgctttccatcttg 45 6 0 

4520 ■ 4540 • 4560 

20 

19 gcccaagcc 2 7 

1 1 1 1 1 i 1 1 1 

4 561 tgccaatacacatttggattcagcccaagccacacctaactcatgccagcagaggcagga 4 62 0 

4580 • 4600 ■ 4620 



4 621 acacctgttgttgacacattctttgcgcataagcactttaatccctctctcacacccaga 4 68 0 

4640 • 4660 • 4680 



4 681 aactaagagctagcccaaaacctccacacctgtcctcgctcatctttccacattcctctg 4 74 0 

4700 • 4720 • 4740 



4741 gccttctttccttgtccttactgtataaaagtccacgaaaacagctgtgcctcactctcg 4 8 00 

4760 • 4780 • 4800 



4 801 agatggtacacgtcctggagcgtgctttgctagagcagcagtcctctgcctgcggcctgc 4 860 

4820 • 4840 • 4860 



4 861 ccggctcttctacggagaccaggcctagccacccctgccccgaggacccagacgtcagca 4 92 0 

4880 • 4900 ■ 4920 



4 921 gactaagactactcctggtggtactctgtgtcctgtttggacttttatgcctgctcctca 4 9 80 

4940 • 4960 • 4980 



4 981 tctaagaagccaccatgcgaccgggtagaccactggctggattctacgctactctccgcc 5 04 0 

5000 ■ 5020 ■ 5040 



5041 gttccttcagaagaatgtccaaaaggtcaaagaacaaggccaagaaggagcgtgtccccg 5100 

5060 • 5080 • 5100 



5101 tggaggaccgcccaccgactccgatgcccaccagccagcgactgatccgcagaaacgcgt 5160

5120 • 5140 ■ 5160 



5161 tgggaggaggcgtccgccccgatgcggaggactgcatccaacgcttccaccccctggagc 5 22 0 

5180 • 5200 ■ 5220 



52 2 1 cagcgctgggggtgtcaacaaagaactttgacctgttgtccctgagatgtgaattgggat 5 2 80 
• 5240 • 5260 • 5280 






52 81 ggtgtggataacatctcccgctagatggcgcccttattattgatgtgacttgtgatgcaa 5 34 0 

5300 ■ 5320 ^ • 5340 



53 41 taaataaaagtacagatagatggcactcttaccttcctctgcccgcttcttcgtatatgt 5 4 00 

5360 " ■ 5380 • 5400 



5401 gttgagatgagtcatcccgtggagagtagggagggggagggagcccgtcattcccgtcgt 54 60 

5420 • 5440 • 5460 



5461 gttgcaatcccaagtacagactttgatcttgggttcctagtggttgatagtccgagtgac 5520

5480 • 5500 ~ • ~ 5520 



55 21 ggtcgccattgccccaatatgggtcctcataaggcggtgggggctcttcattagattcac 55 8 0 

5540 ■ 5560 • " 5580 



5581 gttcctcatcgttcggtggggtgggggtgttcccagaagagccagaagcagatggatatt 5 64 0 

5600 • 5620 • 5640 



5641 gggagttgtttccgccatcgtacccatccggatccccgccggggctagggggacccgcgc 5700

5660 • 5680 • 5700 



57 01 ccattggcaccatttctagggaccccatagctgcagcagcgactgcaaaaccggcctgaa 57 60 

5720 • 5740 • 5760 



5761 atgctctctgagaaacaaggcgagagggatttacgccccagcaagcttcaggcaaaataa 582 0 

5780 ■ 5800 • 5820 



5 821 cctgtgatgctgaaaccgcggcgctacaggatcaccccaattggcaagacctgggcggag 5880 
• " 5840 • 5860 • 5880 



5 881 ttcctgagagcccaggggtctcgtgcaggtgtccccggggaattgttctccctgatcacc 5 94 0 

5900 • 5920 • 5940 

28 caccc 32 

Mill 

5 9 41 gcccacccccgttttctccaaaccagcagctgtgacaatcttcacacactgctgctgtca 6 00 0 

5960 • 5980 • " 6000 



6001 cctggaactattttcccacggtgcccttccgcccattttcccacgagtcgcgaggctatc 6060 

6020 • 6040 • 6060 






6061 cacccgcaatgccacccccccaatgccacactaaaacaaggtgaaataggcaagtgcgtt 612 0 

6080 ■ 610 0^ • 6120 



6121 tattgcgacaagtatccagaaacataaaccccgtgggcttcctccttgtcatttttccca 6180 

6140 • 6160 • 6180 



6181 acgcaggtcactggcaggtgccagggcttgggaagtgacaggtcaacagcaacagagagg 62 4 0 

6200 • 6220 • 6240 



62 41 ctcccatccttttccttcataacaccgccatttgccgcagttggtgcgggctccacgccc 6300 

6260 • 6280 • 6300 



63 01 tcgggcatgagccactggacgtggggatggggaaatgcattcacggtgcatgtcacagta 6360 

6320 ■ 6340 ■ 6360 



6361 aggacagagaagtctgggaactgagacctttcggagtggacagacagcgttagaggcttc 64 2 0 

6380 • 6400 -6420 



64 21 accacgctcaggtgttcctgcttggtgacctcggtctcgcccagtttcatgcggcacagg 6480 

6440 • 6460 -6480 



64 81 tagttgccgtcatgggagatgttggcagcggtgactactaaaaagaaggtgttggcactt 65 4 0 

6500 • 6520 • 6540 



654 1 ctgtggatatcaaagaagcccctgaaaggccactctataaagatgacatcgtggtgcatg 6 600 

6560 • 6580 -6600 



6601 cgcccaataagcacctgctcctctcctgggcccagtttaaaccagctgacctcaatctct 6 6 6 0 

6620 • 6640 • 6660 

33 tccag 37 

1 1 1 1 1 

6661 ggaccgaggctcaccctcctccagtaggaggtcagggtgactcgctcacccaagaaagcg 672 0 

6680 • 6700 • 6720 



672 1 gtgacagcctggccggcggccacacaggaggccaacaggaggagctgagcgatgaacctg 6780 
• 6740 • 6760 • 6780 



6781 gccattgctctggactctcctcacccaggcctcgcgtcttatactattctgccacgccca 6840

6800 • 6820 • 6840






6841 ttttatcatataagcctgaagcccgtgagctggcctgacgagaccatgaggccagccaag 6 90 0 

6860 • 6880 ■ 6900 



6 901 tctacagattctgtgtttgtgaggaccccggtcgaggcgtgggtcgcgccctcgccgccg 6 96 0 

6920 ■ 6940 • " 6960 



6961 gacgacaaggtggctgagtccagctacctcatgttcagggccatgtacgcggtgttcacc 7 02 0 

6980 ■ 7000 • 7020 



7 021 cgggatgagaaagacctgcctttgccagccctggtcctctgccggctcatcaaggcctcc 7 080 

7040 • 7060 • 7080 



7 081 ctgaggaaggataggaagctgtacgcggagctggcctgcaggacagccgacatcgggggc 7140 

7100 • 7120 " ' • 7140 



7141 aaagacacgcacgtacggctcatcatcagcgtcctgcgcgcagtgtacaacgaccactac 72 0 0 

7160 • 7180 • 7200 



7 201 gactactggtcgcggctcagggtggtgctgtgctacacagtggtgtttgcggtgcgaaac 72 6 0 

7220 • 7240 ■ " 7260 



7261 tacctggatgaccacaagagcgccgccttcgtgctgggggcaatcgcccactacctggcc 7320 

7280 • 7300 ■ 7320 

40 

38 gggaggctggaggcggattt 57 

I I I I I I I I jxxxxxxxxxx 
7321 ctctatcgcagactctggtttgcgaggctgggcggcatgccaagatcgctgagacgtcag 7380

7340 • 7360 • 7380 



73 81 ttccccgtgacgtgggccctggccagcctgactgacttcctgaaatctttgtaaatgaat 74 4 0 

7400 • 7420 ■ 7440 



744 1 aaacagtgggtgttgcgtgatgagtaaagtgtaacatttaatgtgggactgggaggccgg 7 500 

7460 • 7480 • " 7500 



7501 ggcgataccttgggcatcatgcagggtgcacagactagcgaggataatctgggcagccag 75 6 0 

7520 • 7540 • 7560 



7561 agccagccgggtccgtgcggctacatctacttttaccccctggccacctaccctcttagg 7620

7580 • 7600 • 7620






7 621 gaggtggccacactggggaccggctacgcgggccacaggtgcctgacggtgccgctcctt 76 8 0 

7640 • 7660 • 7680 



7 681 tgcggcatcaccgtggagccgggcttcagcatcaatgtcaaggctctgcacaggaggccc 7740 

7700 • 7720 • 7740 



7 7 41 gaccccaactgcgggctcctacgcgctacctcctatcacagggacatctacgtgttccac 7800 
• 7760 • 7780 "~ • 7800 



7 801 aatgcccatatggttccccccatctttgaggggccgggtctcgaggccctctgtggcgag 7 860 

7820 ■ 7840 • * 7860 



7 861 accagggaggtgtttgggtacgacgcctacagcgccctaccgagggaaagctccaagccg 7 92 0 

7880 • 7900 • 7920 



7 921 ggggacttcttccccgaagggctagatccctctgcctacctgggggcggtggcaataacc 7 980 

7940 • 7960 • 7980 



7 981 gaggccttcaaggagcgactctacagcggaaacctggtggccattccatcgttaaaacag 804 0 

8000 • 8020 ■ 8040 



8041 gaggtagcggtggggcagtctgcgagcgttagggtcccgctctacgacaaggaggtgttc 8100
: ^ " 8060 ■ 8080 • 8100 



8101 ccagagggcgtgccccagctccgccagttttacaactcggacctcagccgctgcatgcac 8160 
• 8120 • 8140 • 8160 



8161 gaggcgctgtacaccgggctggcgcaggcgctgcgcgtccgacgggtgggcaagctggtg 82 2 0 

8180 • 8200 • 8220 



82 21 gagctgctggagaagcagagcctgcaggaccaggccaaggtggccaaggtggcccccctc 82 80 

8240 ■ 8260 ■ 8280 



82 81 aaggagttcccagcctcaaccatcagtcacccggactcgggagccttaatgattgtggac 834 0 
• 8300 • 8320 • 8340 



8341 agcgcggcatgcgagctggcggtgagctacgcacccgccatgctggaggcctcgcacgag 8400

8360 • 8380 • 8400




60 

58 tccaga- 63 

MINI 

84 01 accccggccagcctcaactacgactcgtggcccctgtttgccgactgtgagggtccagag 84 60 

8420 • 8440 -8460 



84 61 gcccgtgtggctgcgttacaccgatataatgccagcctggccccccacgtgtccacgcag 852 0 

8480 • 8500 • 8520 



8521 atctttgccaccaattccgtcctctacgtctcgggggtctcgaagtcaaccggtcagggc 85 8 0 

8540 • 8560 • 8580 

64 cagt 67 

I I I I 

85 81 aaggagagtctctttaacagtttctacatgacccacggcctggggaccctgcaggagggg 8 64 0 

8600 ■ 8620 • 8640 

68 cccctgcttc 77 

MINIMI 

8641 acctgggacccctgccgccgaccctgcttctcgggctggggtgggccagacgtgaccgga 8700 

8660 • 8680 • 8700 



8701 accaacggtccgggaaactacgctgtggagcacctggtctatgcggcctccttctcgccc 87 60 

8720 8740 • 8760 



87 61 aaccttcttgcccgctatgcctactacctgcagttttgccagggacagaagagctctctg 882 0 

8780 • 8800 • 8820 



8821 accccggtgccggagacgggcagctacgtggcgggggcggccgccagtcccatgtgctcg 8 8 8 0 

8840 ■ 8860 ■ 8880 



8 881 ctctgcgagggccgggccccggccgtgtgcctgaacacgctcttctttaggctgagggac 894 0 

8900 ■ 8920 • 8940 



89 41 cgcttcccccccgtcatgtccacgcagcggagggacccctatgtgatctcgggggcctcg 9000 

8960 • 8980 • " 9000 



9001 ggctcctacaacgagacggactttttgggcaactttctcaacttcatcgataaggaggac 9 060 

9020 • 9040 • " 9060 
80 • 100 
78 cta-aatttcaag-agctgaaccagaa 102 

Ml I II I Ml I Mill I Ml 

9061 gacgggcagcggccggacgacgagccccgctacacctactggcagctgaaccagaacctg 9120 

9080 • 9100 • 9120 



9121 ctggagcggctgtctcggctgggcatagacgctgaaggaaagctagagaaggagccccat 9180

9140 • 9160 • 9180






9181 ggcccgcgtgactttgtcaagatgttcaaggacgtggatgcggcggtggacgccgaagtg 92 4 0 

9200 • 9220 • 9240 



92 41 gtccagtttatgaacagcatggccaagaacaacatcacctacaaggacctggtcaagagc 9 30 0 

9260 • 9280 • 9300 



9301 tgctaccacgtgatgcagtactcgtgcaacccctttgcgcagcccgcctgccccatcttc 9 360 

9320 • 9340 • 9360 

120 

103 taatctccccaatgatgttt 122 

Ml || | | 

93 61 acccagctgttttaccgctcactgctgaccatcctgcaggacatctccctgcccatctgt 94 2 0 

9380 • 9400 • 9420 

123 ttcgg 127 

I I 

94 21 atgtgctatgagaatgacaaccccgggcttggccagagccccccagagtggctaaagggt 94 8 0 

9440 • . 9460 9480 



94 81 cactaccagacgctgtgcaccaactttaggagcctggccatcgacaagggggtcctcacg 95 4 0 

9500 ■ 9520 • 9540 



9541 gccaaggaggccaaggtggtgcatggggagcccacctgcgacctgccagacctggacgcg 96 0 0 
• 9560 • 9580 • 9600 



96 01 gccctgcagggccgggtgtacggccggcggctgcctgtgcgcatgtccaaggtgctgatg 9 66 0 
• 9620 • 9640 • 9660 



9 661 ctgtgccccaggaacatcaagatcaagaacagggtggtcttcacgggggagaatgccgcc 9 72 0 

9680 • 9700 • 9720 



97 21 ctccagaacagcttcatcaagtccactaccaggagggagaactacatcatcaacgggccc 9780 

9740 • 9760 ■ 9780 


9781 tacatgaaattcctcaacacctaccacaagaccctattcccggacactaagctctcaagc 9 840 

9800 ■ 9820 • 9840 



9841 ctgtacctgtggcacaacttttccaggcggcgctcggtccctgtccccagcggggccagc 990 0 

9860 • 9880 • 9900 



9901 gcggaggagtactctgacctggccctctttgtggacgggggctcccgggcccacgaagag 9960

9920 • 9940 • 9960




12 8 -gaggctcaaa 13 7 

MINIMI 

9961 agcaacgtcatagatgtggtgcctggcaacctggtcacttacgccaagcagaggctcaac 1002 0 

9980 • 10000 • 10020 

140 • 160 • 180 

13 8 gaagtta-cctg--gtatttctgacat-cccagttctgc-t-a-c--gaag-agt-ac-g 185 

I I I MM I I I I M MMMMI I I I I M M M I 

10021 aacgccatcctgaaggcgtgcggccagacccagttctacatcagcctgattcagggactg 10080 

10040 • 10060 • 10080 

200 

186 -tgcagaggacttttggggt 204 

Ml MIMI Ml 

10081 gtgccgaggacgcagtcggtgcccgcccgtgactacccccacgtactgggcacgcgggcg 1014 0 

10100 • 10120 • 10140 



10141 gtggagtcggcagcggcctacgcggaggccacctcctcccttactgcgaccacggtggtc 102 00 

10160 • 10180 • 10200 



10201 tgcgcggccacagactgtcttagccaggtctgcaaggcccgtccggttgtcacgctgcca 10260 

10220 • 10240 • 10260 



102 61 gtgaccatcaacaagtacacgggggtcaacggcaacaaccagatattccaggccgggaac 1032 0 

10280 • 10300 " • " 10320 



10 321 ctgggatactttatgggccggggcgtggacaggaacctgctgcaggcccccggggctggg 10380 

10340 • 10360 " " • ' 10380 



10381 ctgcgcaagcaggccgggggctcttccatgcggaagaagtttgtctttgccacccccacc 104 4 0 

10400 • 10420 • 10440 



10441 ctagggttgaccgtgaagcgccggacccaagccgcgaccacatatgagattgagaacatc 1050 0 

10460 ■ 10480 - 10500 



10501 agggctggcctggaggccattatatcacaaaaacaggaggaagactgtgtgtttgatgtg 105 60 
• 10520 • 10540 • 10560 



105 61 gtgtgcaaccttgtggatgccatgggcgaggcatgcgcctcgctgactagggacgacgcg 10 62 0 

10580 • 10600 10620 



10 62 1 gagtacttattgggccgcttctccgtcctggcggacagcgtcctagaaaccctggcgacc 10 6 80 

10640 • 10660 • ' 10680 



10681 attgcctccagcgggatagagtggacggcggaggccgctcgggactttctggagggagtg 10740

10700 • 10720 • 10740






10741 tggggtgggcccggggcagcccaggacaactttatcagcgtggccgagccggtcagcacc 10800 

10760 • 10780 • 10800 

205 gcctcggc 212 

Mill 

10801 gcgtcgcaggcctcggccgggctgctgctgggtggaggagggcagggctccgggggcaga 10860 

10820 • 10840 -10860 



10861 cgcaagcgccgtctggccaccgttctccccggactcgaggtctagagacccctggggcgg 1092 0 

10880 • 10900 • 10920 



10921 cgatgtcggggctgctggcggcggcgtacagccaggtgtacgccctggcggttgagctga 10980

10940 ■ 10960 • 10980 



10981 gcgtgtgcacccggctggacccccggagtctggacgtggctgcggtggtgcgcaacgccg 11040 

11000 • 11020 • 11040 



11041 gcctgctggccgagctggaggccatcctccttccccgtttgagacggcagaatgaccgtg 11100 

11060 • 11080 • " 11100 



11101 catgcagcgccctgtccctggagctggtgcacctgctagagaactcgagagaggcctctg 11160 

11120 • 11140 • ~ 11160 



11161 ccgcgctgctcgcccctggtagaaagggtacccgggtcccgcctctccgtaccccctcag 11220 

11180 • 11200 • 11220 



112 2 1 tcgcgtactctgtggagttttacggggggcataaagtcgatgtaagtttgtgcctaataa 112 80 

11240 • 11260 • 11280 



112 81 atgacatagagattttaatgaagagaatcaatagcgtgttttattgcatgtctcacacca 11340 

11300 • 11320 • 11340 



11341 tggggctggagagcctggaacgggccctggatctgctgggccgctttcggggcgtaagtc 11400

11360 11380 • 11400 



11401 ccatcccagacccgcgcctctacatcacctctgtgccctgctggcgctgtgtgggcgagc 11460

11420 • 11440 • 11460 



114 61 tgatggttctgcccaaccacggcaacccttccacggcagaggggacccacgtctcctgta 1152 0 

11480 • 11500 • 11520 






11521 accacctggcggtgccggtgaatccggagccggtctcgggactgtttgagaatgaagtcc 1158 0 

11540 • 11560 • 11580 



115 81 gccaggcggggctcgggcacctgttggaggctgaggagaaggcgaggccgggcggcccag 1164 0 

11600 • 11620 ■ • 11640 

220 * 240 

213 gccaacgcgc-catagacaagaggcagagagccagtgtggctggggct 259 

II III II III 1 1 MM ' " 

11641 aggagggcgcggtcccgggccc ggggcggccggaggcagag ============= ====== 11681 

11660 • 11680 

260 • 280 • 300 

260 ggtgctcatgcacaccttggcgggtcatccgccacccccgtccagcaggctcaggccgcc 319



320 • 340 • 360 

32 0 gcatccgctgggaccggggccttggcatcatcagcgccgtccacggccgtagcccagtcc 37 9 



380 • 400 • 420 

380 gcgaccccctctgtttcttcatctattagcagcctccgggccgcgacttcgggggcgac- 438 

III II Ml 

11682 ===================================================ggggcgacc 11690



11691 agagcgctggacacctacaacgtcttctcgacagtgcccccggaggtggcggagctctcg 11750 
11700 • 11720 ■ 11740 



11751 gagctcctctattggaactctggcggccatgctatcggtgcaacggggcagggggagggt 11810
11760 • 11780 • 11800 ~ • 



11811 ggcggccattcccgcctctctgccctgtttgcccgggagcgtcgcctggccctggtgcgg 11870 
11820 • 11840 - 11860 



11871 ggggcctgcgaggaggcgctggcgggggcaaggctgactcacctgtttgacgccgtggct 11930 
11880 • 11900 • 11920 • 



11931 cccggggccacggagcggctcttctgcggcggggtctacagctcctcgggcgacgcggtg 11990 
11940 " • 11960 ■ 11980 " " : 
440 • 460 
439 tgccgccgcctccgccgccgcagccgtcgataccggg 475 

II,! I I I II Ml II I MIMM 

119 91 gaggcgctgaaggcggactgcgccgccgccttcacggc=gcacccccag=taccgggcca 1204 8 
12000 • 12020 • 12040 



12049 tcctgcaaaagaggaacgagctgtacacgcggctcaaccgagccatgcagcggttgggcc 12108
12060 • 12080 • 12100






12109 gaggcgaggaggaggcgtcccgggagagcccggaggtgccccggccggctggggcacgag 12168 
12120 ^ * • 12140 ■ 12160 

476 tc 477 

I 

12169 agcccggcccgtccggcgccctctcggacgcgctcaagcgcaaggagcagtacctgcgcc 12 22 8 
12180 ■ 12200 • 12220 

480 

478 aggtggc 484 

I I I I I I I 

122 29 aggtggccaccgagggtctggccaagctgcagtcctgcctggcgcaacagagcgagaccc 12 2 88 
12240 • 12260 12280 

500 

485 gggggacaacccc-ac-gacacc-g-ccc-ca 511 

lllllll I II I I I I Ml II 

122 89 tgaccgagaccctgtgcctgcgcgtctggggggacgtggtctactgggagctggcccgca 12 34 8 
12300 • 12320 • 12340 

520 

512 cgcggggc-acgtaagaaacagtagccc 538 

Ml I II I MM I I 

12 34 9 tgcgcaaccacttcctctacagacgggccttcgtctcgggtccctgggaggacaggcgcg 12 4 08 
12360 • 12380 • 12400 



12 4 09 ccggcgagggtgccgcctttgagaattc 12 4 36 

12420 

% Identity = 1.9 (243/12628) 

/// 
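The identity figure reported above can be checked with simple arithmetic. The following is a minimal sketch, assuming that 243 is the number of matched positions and that 12628 is the total aligned length including gaps (neither interpretation is stated in the printout). The function name `percent_identity` is hypothetical and is not part of DNA Strider.

```python
def percent_identity(matches: int, aligned_length: int) -> float:
    """Return matched positions as a percentage of the aligned length.

    Assumption: aligned_length counts every alignment column,
    including columns that contain a gap in either sequence.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * matches / aligned_length


# Figures taken from the "% Identity" line of the alignment above.
print(f"% Identity = {percent_identity(243, 12628):.1f}")
```

Under these assumptions the ratio rounds to the reported value of 1.9.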



### DNA Strider 1.4f6 ### Friday, July 13, 2007 5:22:43 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 3.xdna x Bankier et al. EcoRI Dhet fragmentxdna => DNA Parallel 

DNA sequence 1038 bp atgctatcaggt . . . cgcgtggcttga linear 

DNA sequence 12436 bp gaattctcaaag . . . tgtttagaattc linear 



Method: Blocks (Martinez) 

Layout: Standard

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Translation: Off 



Alignment 9. Comparison of the nucleotide sequence
of SEQ ID NO: 3 with the nucleotide sequence of
Fig. 2 of Bankier et al.



1 gaattctcaaaggcggcaccctcgccggcgcgcctgtcctcccagggacccgagacgaag 6 0 

20 • 40 • 60 



6 1 gcccgtctgtagaggaagtggttgcgcatgcgggccagctcccagtagaccacgtccccc 12 0 
• 80 • 100 • 120 



121 cagacgcgcaggcacagggtctcggtcagggtctcgctctgttgcgccaggcaggactgc 180 

140 -160 • 180 



181 agcttggccagaccctcggtggccacctggcgcaggtactgctccttgcgcttgagcgcg 240

200 • 220 • 240 



2 41 tccgagagggcgccggacgggccgggctctcgtgccccagccggccggggcacctccggg 3 00 
• " 260 • 280 • 300 



301 ctctcccgggacgcctcctcctcgcctcggcccaaccgctgcatggctcggttgagccgc 36 0 

320 • 340 • 360 



361 gtgtacagctcgttcctcttttgcaggatggcccggtactgggggtgcgccgtgaaggcg 42 0 

380 • 400 - 420 



421 gcggcgcagtccgccttcagcgcctccaccgcgtcgcccgaggagctgtagaccccgccg 4 80 
• 440 • 460 • 480 



481 cagaagagccgctccgtggccccgggagccacggcgtcaaacaggtgagtcagccttgcc 54 0 

500 • 520 • 540 



541 cccgccagcgcctcctcgcaggccccccgcaccagggccaggcgacgctcccgggcaaac 6 00 

560 " • 580 • 600 






601 agggcagagaggcgggaatggccgccaccctccccctgccccgttgcaccgatagcatgg 660 

620 ■ 640 ■ 660 



661 ccgccagagttccaatagaggagctccgagagctccgccacctccgggggcactgtcgag 720 

680 • 700 ■ 720 



721 aagacgttgtaggtgtccagcgctctggtcgccccctctgcctccggccgccccgggccc 7 80 

740 • 760 780 



7 81 gggaccgcgccctcctctgggccgcccggcctcgccttctcctcagcctccaacaggtgc 8 40 

800 • 820 • 840 



841 ccgagccccgcctggcggacttcattctcaaacagtcccgagaccggctccggattcacc 900 

860 • 880 • 900 



901 ggcaccgccaggtggttacaggagacgtgggtcccctctgccgtggaagggttgccgtgg 9 60 

920 • 940 • 960 



961 ttgggcagaaccatcagctcgcccacacagcgccagcagggcacagaggtgatgtagagg 102 0 

980 • 1000 • " 1020 



10 21 cgcgggtctgggatgggacttacgccccgaaagcggcccagcagatccagggcccgttcc 10 80 

1040 ■ 1060 • 1080 



1081 aggctctccagccccatggtgtgagacatgcaataaaacacgctattgattctcttcatt 114 0 

1100 1120 • 1140 



1141 aaaatctctatgtcatttattaggcacaaacttacatcgactttatgccccccgtaaaac 12 00 
• " 1160 • 1180 • " 1200 



12 01 tccacagagtacgcgactgagggggtacggagaggcgggacccgggtaccctttctacca 12 60 

1220 • 1240 • 1260 



12 61 ggggcgagcagcgcggcagaggcctctctcgagttctctagcaggtgcaccagctccagg 132 0 

1280 • 1300 * • 1320 



132 1 gacagggcgctgcatgcacggtcattctgccgtctcaaacggggaaggaggatggcctcc 13 80 

1340 • 1360 " • ~ 1380 






13 81 agctcggccagcaggccggcgttgcgcaccaccgcagccacgtccagactccgggggtcc 1440 

1400 • 1420 -1440 



144 1 agccgggtgcacacgctcagctcaaccgccagggcgtacacctggctgtacgccgccgcc 15 00 

1460 • 1480 - 1500 



15 01 agcagccccgacatcgccgccccaggggtctctagacctcgagtccggggagaacggtgg 156 0 

1520 • 1540 " • 1560 



15 61 ccagacggcgcttgcgtctgcccccggagccctgccctcctccacccagcagcagcccgg 162 0 

1580 • 1600 • ~ " 1620 



1621 ccgaggcctgcgacgcggtgctgaccggctcggccacgctgataaagttgtcctgggctg 168 0 

1640 • 1660 • 1680 



1681 ccccgggcccaccccacactccctccagaaagtcccgagcggcctccgccgtccactcta 174 0 

1700 ■ 1720 • 1740 



17 41 tcccgctggaggcaatggtcgccagggtttctaggacgctgtccgccaggacggagaagc 18 0 0 
• 17 6 0 • 17 8 0 • 18 0 0 



1801 ggcccaataagtactccgcgtcgtccctagtcagcgaggcgcatgcctcgcccatggcat 18 60 

1820 • 1840 " ■ 1860 



1861 ccacaaggttgcacaccacatcaaacacacagtcttcctcctgtttttgtgatataatgg 192 0 

1880 • 1900 • 1920 



1921 cctccaggccagccctgatgttctcaatctcatatgtggtcgcggcttgggtccggcgct 19 80 

1940 • 1960 ^" • 1980 



1981 tcacggtcaaccctagggtgggggtggcaaagacaaacttcttccgcatggaagagcccc 2 04 0 

2000 • 2020 • ' 2040 



2 041 cggcctgcttgcgcagcccagccccgggggcctgcagcaggttcctgtccacgccccggc 210 0 

2060 • 2080 • 2100 



2101 ccataaagtatcccaggttcccggcctggaatatctggttgttgccgttgacccccgtgt 2160

2120 • 2140 • 2160






2161 acttgttgatggtcactggcagcgtgacaaccggacgggccttgcagacctggctaagac 22 2 0 

2180 • 2200 • 2220 



2221 agtctgtggccgcgcagaccaccgtggtcgcagtaagggaggaggtggcctccgcgtagg 22 8 0 
• 2240 ■ 2260 • 2280 



22 81 ccgctgccgactccaccgcccgcgtgcccagtacgtgggggtagtcacgggcgggcaccg 23 40 

2300 • 2320 • 2340 

1 atgctatcag 10

xxxxx | | | ! 1 

2 341 actgcgtcctcggcaccagtccctgaatcaggctgatgtagaactgggtctggccgcacg 24 00 
• 2360 " • 2380 • "~ 2400 



2 401 ccttcaggatggcgttgttgagcctctgcttggcgtaagtgaccaggttgccaggcacca 24 60 
• 2420 • 2440 • 2460 



2461 catctatgacgttgctctcttcgtgggcccgggagcccccgtccacaaagagggccaggt 25 2 0 

2480 • 2500 • 2520 



2521 cagagtactcctccgcgctggccccgctggggacagggaccgagcgccgcctggaaaagt 2580

2540 • 2560 • 2580 



25 81 tgtgccacaggtacaggcttgagagcttagtgtccgggaatagggtcttgtggtaggtgt 2 64 0 
• 2600 2620 • 2640 



2641 tgaggaatttcatgtagggcccgttgatgatgtagttctccctcctggtagtggacttga 2 7 00 

2660 • 2680 • 2700 



27 01 tgaagctgttctggagggcggcattctcccccgtgaagaccaccctgttcttgatcttga 2 7 60 

2720 • 2740 • 2760 



2 761 tgttcctggggcacagcatcagcaccttggacatgcgcacaggcagccgccggccgtaca 2 82 0 

2780 • 2800 • 2820 



2 82 1 cccggccctgcagggccgcgtccaggtctggcaggtcgcaggtgggctccccatgcacca 2 880 

2840 • 2860 • 2880 



2881 ccttggcctccttggccgtgaggacccccttgtcgatggccaggctcctaaagttggtgc 2 94 0 

2 9 0 0 ■ 2920 • 2940 






2941 acagcgtctggtagtgaccctttagccactctggggggctctggccaagcccggggttgt 30 0 0 

2960 ■ 2980 ■ 3000 



3 0 01 cattctcatagcacatacagatgggcagggagatgtcctgcaggatggtcagcagtgagc 3 060 

3020 . 3040 • 3060 



3061 ggtaaaacagctgggtgaagatggggcaggcgggctgcgcaaaggggttgcacgagtact 312 0 

3080 " " • 3100 " • 3120 



3121 gcatcacgtggtagcagctcttgaccaggtccttgtaggtgatgttgttcttggccatgc 3180 

3140 • 3160 • 3180 



3181 tgttcataaactggaccacttcggcgtccaccgccgcatccacgtccttgaacatcttga 32 40 

3200 • 3220 • 3240 



32 41 caaagtcacgcgggccatggggctccttctctagctttccttcagcgtctatgcccagcc 3300 

3260 • 3280 • 3300 



33 01 gagacagccgctccagcaggttctggttcagctgccagtaggtgtagcggggctcgtcgt 33 6 0 

3320 " • " 3340 • 3360 



33 61 ccggccgctgcccgtcgtcctccttatcgatgaagttgagaaagttgcccaaaaagtccg 34 2 0 

3380 • 3400 " ~ • 3420 



34 21 tctcgttgtaggagcccgaggcccccgagatcacataggggtccctccgctgcgtggaca 34 80 
• 3440 • 3460 • 3480 



34 81 tgacgggggggaagcggtccctcagcctaaagaagagcgtgttcaggcacacggccgggg 35 4 0 

3500 ■ 3520 • 3540 



35 41 cccggccctcgcagagcgagcacatgggactggcggccgcccccgccacgtagctgcccg 36 00 

3560 ■ 3580 • 3600 



3 601 tctccggcaccggggtcagagagctcttctgtccctggcaaaactgcaggtagtaggcat 3660 
• 3620 • 3640 • 3660 



3661 agcgggcaagaaggttgggcgagaaggaggccgcatagaccaggtgctccacagcgtagt 3720

3680 • 3700 • 3720






3 721 ttcccggaccgttggttccggtcacgtctggcccaccccagcccgagaagcagggtcggc 3780 

3740 ■ 3760 • " 3780 



3781 ggcaggggtcccaggtcccctcctgcagggtccccaggccgtgggtcatgtagaaactgt 3 84 0 

3800 • 3820 • 3840 



3841 taaagagactctccttgccctgaccggttgacttcgagacccccgagacgtagaggacgg 3900 

3860 3880 : 3900 

11 gta 13 

III 

3 901 aattggtggcaaagatctgcgtggacacgtggggggccaggctggcattatatcggtgta 3 9 60 

3920 • 3940 ■ 3960 

14 acgc • 17 

INI 

3 961 acgcagccacacgggcctctggaccctcacagtcggcaaacaggggccacgagtcgtagt 4 02 0 

3980 * 4000 • 4020 



4 021 tgaggctggccggggtctcgtgcgaggcctccagcatggcgggtgcgtagctcaccgcca 4 0 80 

4040 • 4060 4080 



4 081 gctcgcatgccgcgctgtccacaatcattaaggctcccgagtccgggtgactgatggttg 414 0 

4100 • 4120 " " ~ • ' 4140 



4141 aggctgggaactccttgaggggggccaccttggccaccttggcctggtcctgcaggctct 4 2 00 

4160 • 4180 • 4200 



42 01 gcttctccagcagctccaccagcttgcccacccgtcggacgcgcagcgcctgcgccagcc 4 2 60 

4220 ■ 4240 ~ • 4260 



42 61 cggtgtacagcgcctcgtgcatgcagcggctgaggtccgagttgtaaaactggcggagct 4 32 0 

4280 • 4300 * 4320 



4321 ggggcacgccctctgggaacacctccttgtcgtagagcgggaccctaacgctcgcagact 4380

4340 • 4360 • 4380 



4 381 gccccaccgctacctcctgttttaacgatggaatggccaccaggtttccgctgtagagtc 4 4 40 

4400 • 4420 • 4440 



44 41 gctccttgaaggcctcggttattgccaccgcccccaggtaggcagagggatctagccctt 4 5 00 

4460 • 4480 • 4500 
4501 cggggaagaagtcccccggcttggagctttccctcggtagggcgctgtaggcgtcgtacc 4560

4520 ■ 4540 • 4560 



45 61 caaacacctccctggtctcgccacagagggcctcgagacccggcccctcaaagatggggg 4620 

4580 • 4600 • 4620 



462 1 gaaccatatgggcattgtggaacacgtagatgtccctgtgataggaggtagcgcgtagga 4 680 

4640 • 4660 ; 4680 



4681 gcccgcagttggggtcgggcctcctgtgcagagccttgacattgatgctgaagcccggct 47 40 
• 4700 • 4720 • 4740 



4741 ccacggtgatgccgcaaaggagcggcaccgtcaggcacctgtggcccgcgtagccggtcc 4 8 00 

4760 ~ • 4780 ~ • 4800 



4801 ccagtgtggccacctccctaagagggtaggtggccagggggtaaaagtagatgtagccgc 4 8 60 

4820 ■ 4840 • 4860 



4 861 acggacccggctggctctggctgcccagattatcctcgctagtctgtgcaccctgcatga 4920 

4880 • 4900 • 4920 



4 921 tgcccaaggtatcgccccggcctcccagtcccacattaaatgttacactttactcatcac 49 8 0 

4940 • 4960 . • 4980 



4 981 gcaacacccactgtttattcatttacaaagatttcaggaagtcagtcaggctggccaggg 5 04 0 

5000 • 5020 • 5040 



5041 cccacgtcacggggaactgacgtctcagcgatcttggcatgccgcccagcctcgcaaacc 5100 

50 6 0 • 5 0 8 0 • 510 0 



5101 agagtctgcgatagagggccaggtagtgggcgattgcccccagcacgaaggcggcgctct 5160 

5120 • 5140 • 5160 



5161 tgtggtcatccaggtagtttcgcaccgcaaacaccactgtgtagcacagcaccaccctga 52 2 0 

5180 • 5200 • 5220 



52 21 gccgcgaccagtagtcgtagtggtcgttgtacactgcgcgcaggacgctgatgatgagcc 52 8 0 

5240 • 5260 • 5280 
5281 gtacgtgcgtgtctttgcccccgatgtcggctgtcctgcaggccagctccgcgtacagct 5340

5300 • 5320 • 5340 



5 341 tcctatccttcctcagggaggccttgatgagccggcagaggaccagggctggcaaaggca 5400 

5360 • 5380 " • 5400 



54 01 ggtctttctcatcccgggtgaacaccgcgtacatggccctgaacatgaggtagctggact 5 4 60 

5420 • 5440 • 5460 



54 61 cagccaccttgtcgtccggcggcgagggcgcgacccacgcctcgaccggggtcctcacaa 552 0 

5480 " " . 5500 ' . 5520 



5521 acacagaatctgtagacttggctggcctcatggtctcgtcaggccagctcacgggcttca 5580 
• 5540 ■ 5560 " ' ■ ~ 5580 



55 81 ggcttatatgataaaatgggcgtggcagaatagtataagacgcgaggcctgggtgaggag 5 64 0 

5600 • 5620 • 5640 



5 641 agtccagagcaatggccaggttcatcgctcagctcctcctgttggcctcctgtgtggccg 5 7 00 

5660 • 5680 " • 5700 



5701 ccggccaggctgtcaccgctttcttgggtgagcgagtcaccctgacctcctactggagga 5760

5720 ■ 5740 - 5760 



57 61 gggtgagcctcggtccagagattgaggtcagctggtttaaactgggcccaggagaggagc 5 82 0 

5780 • 5800 • ' 5820 



58 21 aggtgcttattgggcgcatgcaccacgatgtcatctttatagagtggcctttcaggggct 5 8 80 

5840 ■ 5860 • 5880 



5 881 tctttgatatccacagaagtgccaacaccttctttttagtagtcaccgctgccaacatct 5940 

5900 ■ 5920 ■ 5940 



59 41 cccatgacggcaactacctgtgccgcatgaaactgggcgagaccgaggtcaccaagcagg 60 0 0 

5960 ■ 5980 ~ ■ 6000 



60 01 aacacctgagcgtggtgaagcctctaacgctgtctgtccactccgaaaggtctcagttcc 6 0 60 
• 6020 • 6040 • 6060 
6061 cagacttctctgtccttactgtgacatgcaccgtgaatgcatttccccatccccacgtcc 6120

6080 • 6100 • 6120 



6121 agtggctcatgcccgagggcgtggagcccgcaccaactgcggcaaatggcggtgttatga 6180 

614 0 ■ 6160 '6180 



6181 aggaaaaggatgggagcctctctgttgctgttgacctgtcacttcccaagccctggcacc 62 4 0 

6200 • 6220 • 6240 



62 41 tgccagtgacctgcgttgggaaaaatgacaaggaggaagcccacggggtttatgtttctg 63 0 0 

6260 ■ 6280 . " 6300 



6301 gatacttgtcgcaataaacgcacttgcctatttcaccttgttttagtgtggcattggggg 6360 

6320 • 6340 ' • 6360 



63 61 ggtggcattgcgggtggatagcctcgcgactcgtgggaaaatgggcggaagggcaccgtg 64 2 0 

6380 ■ 6400 • 6420 



6421 ggaaaatagttccaggtgacagcagcagtgtgtgaagattgtcacagctgctggtttgga 64 8 0 

6440 6460 6480 



64 81 gaaaacgggggtgggcggtgatcagggagaacaattccccggggacacctgcacgagacc 654 0 

6500 • 6520 • 6540 



65 41 cctgggctctcaggaactccgcccaggtcttgccaattggggtgatcctgtagcgccgcg 6 60 0 

6560 • 6580 • 6600 



6 601 gtttcagcatcacaggttattttgcctgaagcttgctggggcgtaaatccctctcgcctt 66 6 0 

6620 • 6640 • 6660 



6661 gtttctcagagagcatttcaggccggttttgcagtcgctgctgcagctatggggtcccta 672 0 

6680 ■ 6700 ■ ^ 6720 



6721 gaaatggtgccaatgggcgcgggtccccctagccccggcggggatccggatgggtacgat 6780 

6740 • 6760 • * " " 6780 



67 81 ggcggaaacaactcccaatatccatctgcttctggctcttctgggaacacccccacccca 6 84 0 

6800 ■ 6820 • 6840 
6841 ccgaacgatgaggaacgtgaatctaatgaagagcccccaccgccttatgaggacccatat 6900

6860 • 6880 • 6900 



6 901 tggggcaatggcgaccgtcactcggactatcaaccactaggaacccaagatcaaagtctg 6960 

6920 ■ 6940 • 6960 



6 961 tacttgggattgcaacacgacgggaatgacgggctccctccccctccctactctccacgg 7 02 0 

6980 • 7000 • 7020 



7 021 gatgactcatctcaacacatatacgaagaagcgggcagaggaaggtaagagtgccatcta 7 0 80 

' 7040 • 7060 • 7080 



7 081 tctgtacttttatttattgcatcacaagtcacatcaataataagggcgccatctagcggg 714 0 

7100 • 7120 • 7140 



7141 agatgttatccacaccatcccaattcacatctcagggacaacaggtcaaagttctttgtt 72 00 

7160 • 7180 • 7200 



72 01 gacacccccagcgctggctccagggggtggaagcgttggatgcagtcctccgcatcgggg 72 60 

7220 • 7240 • 7260 



7261 cggacgcctcctcccaacgcgtttctgcggatcagtcgctggctggtgggcatcggagtc 7 32 0 

7280 7300 7320 



7 321 ggtgggcggtcctccacggggacacgctccttcttggccttgttctttgaccttttggac 7 380 

7340 • 7360 • 7380 



73 81 attcttctgaaggaacggcggagagtagcgtagaatccagccagtggtctacccggtcgc 7440 

7400 • 7420 • 7440 



7441 atggtggcttcttagatgaggagcaggcataaaagtccaaacaggacacagagtaccacc 7 50 0 
• * 7460 • 7480 • 7500 



7501 aggagtagtcttagtctgctgacgtctgggtcctcggggcaggggtggctaggcctggtc 7 56 0 

7520 • 7540 • 7560 



7561 tccgtagaagagccgggcaggccgcaggcagaggactgctgctctagcaaagcacgctcc 7620

7580 • 7600 • 7620
7621 aggacgtgtaccatctcgagagtgaggcacagctgttttcgtggacttttatacagtaag 7680

7640 • 7660 • 7680 



7 681 gacaaggaaagaaggccagaggaatgtggaaagatgagcgaggacaggtgtggaggtttt 77 4 0 

7700 • 7720 • 7740 



7741 gggctagctcttagtttctgggtgtgagagagggattaaagtgcttatgcgcaaagaatg 7 800 

7760 • 7780 • 7800 



7 801 tgtcaacaacaggtgttcctgcctctgctggcatgagttaggtgtggcttgggctgaatc 7860 

7820 • 7840 ■ 7860 



7 861 caaatgtgtattggcacaagatggaaagcaaagttgctggagttactgggtgggagacag 7920 

7880 ■ 7900 • 7920 



7 921 ggatgtatgtggtcccccgctggtatgccagtaccctgtggaagtaaggggcctcatctg 79 80 

7940 ■ 7960 7980 



7 981 cctggtagttgtgttgtgcagaggtctgatgtgtgtaggaggggtgggttcaacgcaggg 804 0 

8000 • 8020 • 8040 



8 041 gcgttggtggcggagtctggcaacgcccgggtccttgctacctgtgtggtgtgttaaggg 8100 
• 8060 • 8080 • 8100 



8101 ctgggtaaaggtgtctgccaattctcgcatgtcctcctttccccttgttttgaaatagaa 8160 

8120 ■ 8140 ■ 8160 



8161 tatgaatgtggcttttcagcctagacagacagtgtggctaagggagtgtgtgccagttaa 82 2 0 

8180 • 8200 • 8220 



82 21 ggtgattagctaaggcattcccagtaaatggagggagagtcagtcaggcaagcctatgac 82 8 0 

8240 • 8260 • 8280 



82 81 atggtaatgcctagaagtaaagaaaggttagtcatagtagcttagctgaactgggccgtg 83 4 0 

8300 ■ 8320 • 8340 



8341 ggggtcgtcatcatctccaccggaaccagaagaacccaaaagcagcgtaggaaggtgtgg 84 0 0 

8360 • 8380 • 8400 

8401 atcaccgccgccatggccggaatcatgactatgaccgccgcctccgtctgtcatcaaagg 8460 

8420 • 8440 • 8460 



8461 cgggccctggtcacctcctttgttttcaacctcttccgtcaattgtggagggcctccatc 85 2 0 

8480 • 8500 • 8520 



8521 atttccagcagagtcgctagggctatgaggcagcgggtcatgtgggccattgtcatcagt 85 8 0 

8540 • 8560 • 8580 



85 81 gttgtcagggtcctgtgggccattgtcatcagtgttgtcagggtcctgaggcagcgggtc 8 64 0 

8600 • 8620 • 8640 



8641 atgtgggccattgtcatcagtgttgtcagggtcctgtgggccattgtcatcagtgttgtc 8 7 00 

8660 • • 8680 • 8700 



87 01 agggtcctgtgggccattgtcaggaccacctccaggtgcgcctaggttttgagagcagag 8 7 60 

8720 • 8740 • 8760 



8761 tgggggtccgtcgccggctccactcacgagcaggtggtgtctgccctcgttggagttaga 8820 

8780 ■ 8800 • 8820 



8 821 gtcagattcatggccagaatcatcggtagcttgttgagggtgcgggagggagtcatcgtg 888 0 

8840 • 8860 8880 



88 81 gtggtgttcatcactgtgtcgttgtccatggtaatacatccagattaaaatcgccagaaa 8 94 0 

8900 • 8920 • 8940 



89 41 caggaggagccaaaggagatcaaccaatagagtccaccagttttgttgtagatagagagc 9 000 

8960 • 8980 • 9000 



90 01 aataatgagcaggatgaggtctaggaagaaggctaggaagaaggccaaaagctgccagat 9 06 0 
• ^ 9020 • 9040 • 9060 



90 61 ggtggcaccaagtcgccagagcatctccaataagtagatccagatacctaagactgcgtt 912 0 

9080 • 9100 • 9120 



912 1 gaaaaaagagtgttagggttggaaaagtgggggtgtggtaaataattcctagggaatgtt 9180 

9140 • 9160 • 9180 

9181 agatcttaccaagtaagcacccgaagatgaacagcacaattccaaggaacaatgcctgtc 9240 

9200 • 9220 ■ 9240 



92 41 cgtgcaaattccagagagcgatgagcaggagggtgactggggaaagaggagaaagtgcgt 9300 

9260 • 9280 • 9300 



9301 tagagaaggaagagtaagggaaagggggtgtggggcaaagggtgtaatacttactcatca 9360 

9320 • 9340 • 9360 



9361 gtaggagtatacaaagggctccaagtggacagagaaggtctcttctgaagataaagatga 942 0 

9380 • 9400 • 9420 



94 21 tcaaaattataattataagcatgagagcaaaggaatagaggacaaggagggctcctccag 94 80 

9440 • 9460 • 9480 

20 

18 aggagaaggag-- 28 

IIMIIIIIII 

94 81 tccagtcactcataacgatgtacagccaaaacagtagcgccaagaggaggagaaggagag 95 4 0 

9500 • 9520 • 9540 



95 41 caaggcctagggaagaggagagggggggtcctcgagggggccgtcgcgggcccggtgggc 9 600 

9560 • 9580 9600 

40 

29 caacagcctgcggagg 44 

Ml II I 1 1 1 1 

9 601 ccctctcaaggtcgtgttccatcctcagggcagtgtgtcaggagcaag=gca=gttgagg 9 65 8 

9620 ■• ~ 9640 • 

60 • 80 100 

4 5 ttcggccgccgcgggccaggacctcatcagcgtcccccgcaacacctttatgacactgct. 104 

I I M I mi i I II I I " " 

9 65 9 aaagaagggggcagagcagtg==tgagaggcttatgtag===================== 9695 

9660 • 9680 

120 • 140 • 160 

105 tcagaccaacctggacaacaaaccgccgaggcagaccccgctaccctacgcggccccgct 164 



180 • 200 • 220 

165 gccccccttttcccaccaggcaatagccaccgcgccttcctacggtcctggggccggagc 224 

240 ■ 260 ■ 280 

225 ggtcgccccggccggcggctactttacctccccaggaggttactacgccgggcccgcggg 2 84 

IIMIIIIIII I I I II I I I Mill 

9 696 =============ggcggctacgtcagagtaacgcgtgtttcttgggatgtaggcccggg 97 4 2 

9700 • 9720 • 9740 

300 • 320 • 340 

2 85 cggggacccgggtgccttcttggcgatggacgctcacacctaccacccccacccacaccc 344 

ii . • . 

9743 gggat======================================================= 9747 

360 • 380 • 400 

345 ccctccggcctactttggcttgccgggcctctttggcccccctccaccgtgcctccttac 4 04 



420 

405 tacggattcccacttgcggg 424 

1 1 1 1 M i 

9 74 8 =============ttgcggggtctgccggaggcagtacgggtacagatttcccgaaagcg 97 94 

9760 • 9780 



9 7 95 gcggtgtgtgtgtgcatgtaagcgtagaaaggggaagtagaaagcgtgtgtttgtgttag 9 85 4 
9800 • 9820 • 9840 

440 • 460 • 480 

4 25 cagactacgtccccg-ctccctcgcgatccaacaagcgg-aaaagagaccccgaggagga 4 82 

II lllllll I II I I I MM III Mill 

9 8 55 aaaagcgggtccccggggggcaagctgtggga=atgcggtggccaagtgcaacagga=aa 9 912 
9860 9880 • 9900 

500 • 520 • 540 

4 8 3 tgaagaaggcggggggctat-tcccgggggaggacgccaccctctaccgcaaggacatag 541 

II I I I I I MM I lllllll 

9913 tggaaaggcagtgcggcaatcagaagggggag============================ 9944 

9920 • 9940 

560 • 580 • 600 

542 cgggcctctccaagagtgtgaatgagttacagcacacgctacaggccctgcgccgggaga 601 



620 • 640 ■ 660 

6 02 cgctgtcctacggccacaccggagtcggatactgcccccagcagggcccctgctacaccc 661 



680 • 700 • 720 

6 62 actcggggccttacggatttcagcctcatcaaagctacgaagtgcccagatacgtccctc 721 



740 • 760 • 780 

722 atccgcccccaccaccaacttctcaccaggcagctcaggcgcagcctccacccccgggca 781 

i ii i i ii ii i i ii 

994 5 ======================tgcgtagtgttgtgggaagcggcagtgtaatctgcaca 998 2 

9960 • 9980 
800 • 820 • 
782 cacaggcccccgaagcccactgtgtggccgagtccacgatccctgaggcgggag 835 

lllllll ii 1 1 1 1 ii i I ii i i i I ii i ii 1 1 1 

9 9 83 aagaggcgcggggcgcgcaacgt=tgggag=gtcgttg==gcggcaggcgggaggccgtg 10038 

10000 • 10020 • 



10039 ctttaggggggttcaggtgaggcaaggctgtggggtaaccgtaggggaggcgggtgaggc 10098 
10040 • 10060 • 10080 

840 

836 cagccgggaa-ctc 848 

I Mill II 

100 9 9 ggctaagagggctaagggtcggcgggtgacgaagcagcagacggcggatatgggaatttc 1015 8 
10100 • 10120 • 10140 

860 • 880 • 900 

8 49 tggaccccg-ggaggacaccaaccctca-gcagcccac-caccgag-ggc--ca-cca-c 900 

ii i.iiiiii i mi i 1 1 1 1 i i in 

1015 9 agaatgaggtggcggattcaggcgaaaagggtgtgggctgtgcgagtgtcatgaggcagg 10218 
10160 • 10180 • 10200 

920 • 940 • 960 

901 cgcggaaagaaactggtgcaggcctctgcgtccggagtggctcagtctaaggagcccacc 9 60 
| | | | | | | | | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
10219 cgcggaaagtcgctgcggcttgctggggcatggggggccgcgcattcctggaaaaagtgg 10278 
10220 • 10240 • 10260 

980 • 1000 ■ 1020 

961 acccccaaggccaagtctgtgtcagcccacctcaagtccatcttttgcgaggaattgctg 1020 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
10279 agggggcgtggccttcccccgcggccccccagcccccccgcacagagcggcgctacggcg 1033 8 
10280 • 10300 • 10320 

1021 aataaacgcgtggcttga 1038 

xxxxxxxxxxxxxxxxxx 

10 339 ggcgggcggcggggggtcggggtccgcgggctccgggggctgcgggcggtggatggcggc 10398 
10340 • 10360 * 10380 ' ~ • 



10399 ggacgttccggggatcgggggggtcggggggcgccgcgcgggcgcagccatgcgtgaccg 10458 
10400 • 10420 ■ 10440 • 



10459 tgatgagggggcagggtcgcagggggtgtgtctggtgggggcgggagcggggggcggcgc 10518 
10460 • 10480 • 10500 



10519 gggagcctgcacgccgttggagggtagaatgacagggggcggggacagagaggcggtcgc 1057 8 
10520 • 10540 • 10560 . 



105 7 9 gcccccggccgcgccagccaagcccccaaggggggcggggagcgggcaatggagcgtgac 10638 
10580 • 10600 • 10620 



10639 gaagggccccagggctgaccccggcaaacgtgacccggggctccggggtgacccagccaa 106 98 
10640 : 10660 10680 « 



10699 gcgtgaccaaggggcccgtgggtgacacaggcaaccctgacaaaggccccccaggaaaga 10758 
107 00 ■ 1072 0 • 10740 



1075 9 cccccggggggcatcggggggtggggcatggggggccgcgcattcctggaaaaagtggag 10818 
10760 • 10780 • 10800 



10 819 ggggcgtggccttcccccgcggccccccagcccccccgcacagagcggcgctacggcggg 1087 8 
10820 • 10840 • 10860 



1087 9 cgggcggcggggggtcggggtccgcgggctccgggggctgcgggcggtggatggcggcgg 1093 8 
10880 10900 • 10920 • 



10 939 acgttccggggatcgggggggtcggggggcgccgcgcgggcgcagccatgcgtgaccgtg 10998 
10940 ■ 10960 • 10980 

10999 atgagggggcagggtcgcagggggtgtgtctggtgggggcgggagcggggggcggcgcgg 11058 
11000 • 11020 • 11040 



11059 gagcctgcacgccgttggagggtagaatgacagggggcggggacagagaggcggtcgcgc 11118 
11060 • 11080 • 11100 • 



11119 ccccggccgcgccagccaagcccccaaggggggcggggagcgggcaatggagcgtgacga 1117 8 
11120 ^ • 11140 • 11160 • 



1117 9 agggccccagggctgaccccggcaaacgtgacccggggctccggggtgacccagccaagc 112 3 8 
11180 • 11200 • 11220 



112 3 9 gtgaccaaggggcccgtgggtgacacaggcaaccctgacaaaggccccccaggaaagacc 112 9 8 
11240 • 11260 • 11280 



1129 9 cccgtggggcatggggggccgcgcattcctggaaaaagtggagggggcgtggccttcccc 11358 
11300 • 11320 • 11340 



1135 9 cgcggccccccagcccccccgcacagagcggcgctacggcgggcgggcggcggggggtcg 11418 
11360 " • 11380 • 11400 



11419 gggtccgcgggctccgggggctgcgggcggtggatggcggcggacgttccggggatcggg 114 7 8 
11420 ■ 11440 • 11460 



11479 ggggtcggggggcgccgcgcgggcgcagccatgcgtgaccgtgatgagggggcagggtcg 1153 8 
11480 " • 11500 • 11520 



115 3 9 cagggggtgtgtctggtgggggcgggagcggggggcggcgcgggagcctgcacgccgttg 1159 8 
11540 " . " 11560 ■ 11580 



11599 gagggtagaatgacagggggcggggacagagaggcggtcgcgcccccggccgcgccagcc 11658 
11600 ~ . 11620 • 11640 



11659 aagcccccaaggggggcggggagcgggcaatggagcgtgacgaagggccccagggctgac 11718 
11660 • 11680 11700 



11719 cccggcaaacgtgacccggggctccggggtgacccagccaagcgtgaccaaggggcccgt 11778 
11720 • 11740 • 11760 



11779 gggtgacacaggcaaccctgacaaaggccccccaggaaagacccccggggggcatcgggg 11838 
11780 : 11800 ~ - 11820 . • 



11839 ggtggggcatggggggccgcgcattcctggaaaaagtggagggggcgtggccttcccccg 118 9 8 
11840 • 11860 • 11880 



1189 9. cggccccccagcccccccgcacagagcggcgctacggcgggcgggcggcggggggtcggg 11958 
11900 - 11920 " • 11940 ' ^ • 



11959 gtccgcgggctccgggggctgcgggcggtggatggcggcggacgttccggggatcggggg 12018 
11960 • 11980 ■ 12000 



12 019 ggtcggggggcgccgcgcgggcgcagccatgcgtgaccgtgatgagggggcagggtcgca 12078 
12020 " • 12040 • 12060 ■ 



12079 gggggtgtgtctggtgggggcgggagcggggggcggcgcgggagcctgcacgccgttgga 12138 
12080 ■ 12100 • 12120 



12139 gggtagaatgacagggggcggggacagagaggcggtcgcgcccccggccgcgccagccaa 12198 
12140 - 12160 • 12180 



12199 gcccccaaggggggcggggagcgggcaatggagcgtgacgaagggccccagggctgaccc 12258 
12200 ■■ 12220 • 12240 



12 2 59 cggcaaacgtgacccggggctccggggtgacccagccaagcgtgaccaaggggcccgtgg 12318 
12260 • 12280 • 12300 



12319 gtgacacaggcaaccctgacaaaggccccccaggaaagacccccggggggcatcgggggg 12378 
12320 • 12340 ■• 12360 



12379 ggtgttggcgggggcatgggggggtcggatttcgcccttattgccctgtttagaattc 12 4 36 
12380 • 12400 • 12420 



% Identity = 1.6 (210/12958) 
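
The identity figure reported above can be checked with simple arithmetic. The following is a minimal sketch, assuming the two numbers in parentheses are the count of matching positions and the total number of aligned positions as reported by the program; the function name `percent_identity` is illustrative and is not part of DNA Strider.

```python
def percent_identity(matches: int, alignment_length: int) -> float:
    """Return identity as a percentage of the aligned length."""
    if alignment_length <= 0:
        raise ValueError("alignment_length must be positive")
    return 100.0 * matches / alignment_length


# Figures taken from the summary line above: 210 matches over 12958 positions.
identity = percent_identity(210, 12958)
print(f"% Identity = {identity:.1f}")
```

Rounded to one decimal place, the result agrees with the reported value of 1.6.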




DNA Strider 1.4f6 ### Wednesday, July 18, 2007 5:02:44 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 3.xdna x Bankier et al. EcoRI Dhet fragment complement.xdna => DNA Parallel 

DNA sequence 1038 bp atgctatcaggt . . . cgcgtggcttga linear 

DNA sequence 12436 bp gaattctaaaca . . . ctttgagaattc linear 



Method: Blocks (Martinez) 

Layout: Standard 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Translation: Off 



Alignment 10. Comparison of nucleotide sequence 
of SEQ ID NO:3 with the complement of the 
nucleotide sequence of Fig. 2 of Bankier et al. 
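
As an illustration of what "complement" means for this comparison, the sketch below assumes the second sequence is the reverse complement of the 12436 bp fragment. That assumption is consistent with the listings themselves: the preceding listing ends at position 12436 with `gccctgtttagaattc`, and the listing that follows begins at position 1 with `gaattctaaacagggc`. The helper name `reverse_complement` is hypothetical and is not taken from DNA Strider.

```python
# Translation table for the four bases, lower and upper case.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


# Last 16 bases of the preceding listing and first 16 bases of this one.
tail_of_fragment = "gccctgtttagaattc"
head_of_complement = "gaattctaaacagggc"
print(reverse_complement(tail_of_fragment) == head_of_complement)
```

Applying the function twice returns the original strand, which gives a quick sanity check when comparing the two listings by hand.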



1 gaattctaaacagggcaataagggcgaaatccgacccccccatgcccccgccaacacccc 60 

20 • 40 • 60 



6 1 ccccgatgccccccgggggtctttcctggggggcctttgtcagggttgcctgtgtcaccc 12 0 

80 • 100 ■ 120 



121 acgggccccttggtcacgcttggctgggtcaccccggagccccgggtcacgtttgccggg 18 0 

140 • 160 • ~ 180 



181 gtcagccctggggcccttcgtcacgctccattgcccgctccccgccccccttgggggctt 2 40 
: 2 0 0 • 2 2 0 -240 



241 ggctggcgcggccgggggcgcgaccgcctctctgtccccgccccctgtcattctaccctc 30 0 

260 ■ 280 ■ 300 



301 caacggcgtgcaggctcccgcgccgccccccgctcccgcccccaccagacacaccccctg 360 

320 • 340 • 360 



361 cgaccctgccccctcatcacggtcacgcatggctgcgcccgcgcggcgccccccgacccc 42 0 

380 • 400 • " 420 



421 cccgatccccggaacgtccgccgccatccaccgcccgcagcccccggagcccgcggaccc 4 80 
• " 440 • 460 ■ 480 



4 81 cgaccccccgccgcccgcccgccgtagcgccgctctgtgcgggggggctggggggccgcg 54 0 

500 " • 520 • 540 



541 ggggaaggccacgccccctccactttttccaggaatgcgcggccccccatgccccacccc 600 

560 ■ 580 • 600 






601 ccgatgccccccgggggtctttcctggggggcctttgtcagggttgcctgtgtcacccac 66 0 

620 • 640 660 



661 gggccccttggtcacgcttggctgggtcaccccggagccccgggtcacgtttgccggggt 72 0 

680 • 700 ■ 720 



721 cagccctggggcccttcgtcacgctccattgcccgctccccgccccccttgggggcttgg 7 80 

740 • 760 • 780 



781 ctggcgcggccgggggcgcgaccgcctctctgtccccgccccctgtcattctaccctcca 84 0 



841 acggcgtgcaggctcccgcgccgccccccgctcccgcccccaccagacacaccccctgcg 9 00 

860 • 880 • 900 



901 accctgccccctcatcacggtcacgcatggctgcgcccgcgcggcgccccccgacccccc 9 60 

920 • 940 « 960 



961 cgatccccggaacgtccgccgccatccaccgcccgcagcccccggagcccgcggaccccg 102 0 

-980 . • 1000 ■ 1020 



1021 accccccgccgcccgcccgccgtagcgccgctctgtgcgggggggctggggggccgcggg 10 8 0 

1040 • 1060 ■ 1080 



1081 ggaaggccacgccccctccactttttccaggaatgcgcggccccccatgccccacggggg 114 0 

1100 • 1120 • 1140 



1141 tctttcctggggggcctttgtcagggttgcctgtgtcacccacgggccccttggtcacgc 12 00 

1160 • 1180 • 1200 



12 01 ttggctgggtcaccccggagccccgggtcacgtttgccggggtcagccctggggcccttc 12 60 

1220 ■ 1240 • 1260 



12 61 gtcacgctccattgcccgctccccgccccccttgggggcttggctggcgcggccgggggc 132 0 

1280 • 1300 • 1320 



1321 gcgaccgcctctctgtccccgccccctgtcattctaccctccaacggcgtgcaggctccc 1380 

1340 • 1360 • 1380 






13 81 gcgccgccccccgctcccgcccccaccagacacaccccctgcgaccctgccccctcatca 14 4 0 

1400 • 1420 • 1440 



14 41 cggtcacgcatggctgcgcccgcgcggcgccccccgacccccccgatccccggaacgtcc 15 00 

1460 • 1480 • 1500 



1501 gccgccatccaccgcccgcagcccccggagcccgcggaccccgaccccccgccgcccgcc 15 6 0 

1520 1540 • ~ 1560 



1561 cgccgtagcgccgctctgtgcgggggggctggggggccgcgggggaaggccacgccccct 162 0 

1580 ■ 1600 • ~ 1620 



1621 ccactttttccaggaatgcgcggccccccatgccccaccccccgatgccccccgggggtc 16 80 

1640 • 1660 • 1680 



1681 tttcctggggggcctttgtcagggttgcctgtgtcacccacgggccccttggtcacgctt 174 0 

1700 • 1720 • 1740 



17 41 ggctgggtcaccccggagccccgggtcacgtttgccggggtcagccctggggcccttcgt 180 0 

1760 • 1780 . ■ 1800 



1801 cacgctccattgcccgctccccgccccccttgggggcttggctggcgcggccgggggcgc 18 60 

1820 • 1840 • " " . ~ i 86 o 



1861 gaccgcctctctgtccccgccccctgtcattctaccctccaacggcgtgcaggctcccgc 192 0 

1880 • 1900 • 192 0 



1921 gccgccccccgctcccgcccccaccagacacaccccctgcgaccctgccccctcatcacg 19 80 

1940 • 1960 • 1980 



1981 gtcacgcatggctgcgcccgcgcggcgccccccgacccccccgatccccggaacgtccgc 2 04 0 

2000 • 2020 2040 



2 041 cgccatccaccgcccgcagcccccggagcccgcggaccccgaccccccgccgcccgcccg 2100 

2060 • 2080 • " 2100 



2 101 ccgtagcgccgctctgtgcgggggggctggggggccgcgggggaaggccacgccccctcc 2160 

2120 • 2140 ■ " 2160 






2161 actttttccaggaatgcgcggccccccatgccccagcaagccgcagcgactttccgcgcc 2 220 

2180 • 2200 ■ 2220 



2 221 tgcctcatgacactcgcacagcccacacccttttcgcctgaatccgccacctcattctga 22 80 

2240 • 2260 • 2280 



22 81 aattcccatatccgccgtctgctgcttcgtcacccgccgacccttagccctcttagccgc 2 34 0 

2300 ■ 2320 • 2340 



2 341 ctcacccgcctcccctacggttaccccacagccttgcctcacctgaacccccctaaagca 2 4 00 

2360 • 2380 • 2400 



2401 cggcctcccgcctgccgccaacgacctcccaacgttgcgcgccccgcgcctctttgtgca 2 4 60 

2420 • 2440 • 2460 



24 61 gattacactgccgcttcccacaacactacgcactcccccttctgattgccgcactgcctt 2 52 0 

2480 • 2500 • 2520 



2 521 tccatttcctgttgcacttggccaccgcattcccacagcttgccccccggggacccgctt 2580 

2540 • 2560 ~ • 2580 



2581 ttctaacacaaacacacgctttctacttcccctttctacgcttacatgcacacacacacc 2640 

2600 • 2620 " • 2640 



2 641 gccgctttcgggaaatctgtacccgtactgcctccggcagaccccgcaaatccccccggg 27 00 

2660 • 2680 • 2700 



2701 cctacatcccaagaaacacgcgttactctgacgtagccgccctacataagcctctcacac 27 60 

2720 • 2740 • 2760 



27 61 tgctctgcccccttctttcctcaactgccttgctcctgacacactgccctgaggatggaa 2 82 0 

2780 • 2800 ■ 2820 



2 821 cacgaccttgagaggggcccaccgggcccgcgacggccccctcgaggaccccccctctcc 2880 
• 2840 " " ~ ■ 2860 * 2880 



2 881 tcttccctaggccttgctctccttctcctcctcttggcgctactgttttggctgtacatc 2 94 0 

2900 • 2920 ■ " 2940 






2 941 gttatgagtgactggactggaggagccctccttgtcctctattcctttgctctcatgctt 30 0 0 

2960 • 2980 ■ 3000 



3 001 ataattataattttgatcatctttatcttcagaagagaccttctctgtccacttggagcc 3 0 60 

3020 • 3040 • 3060 



3061 ctttgtatactcctactgatgagtaagtattacaccctttgccccacaccccctttccct 312 0 

3080 • 3100 • 3120 

20 

1 atgctatcaggtaacgcagg 20 

xxxxxxxxxxx| I I I I I I XX 
3121 tactcttccttctctaacgcactttctcctctttccccagtcaccctcctgctcatcgct 3180 

3140 • 3160 - 3180 



3181 ctctggaatttgcacggacaggcattgttccttggaattgtgctgttcatcttcgggtgc 32 4 0 
■ ' 3200 • 3220 - 3240 



32 41 ttacttggtaagatctaacattccctaggaattatttaccacacccccacttttccaacc 330 0 
• 3260 • 3280 • 3300 



33 01 ctaacactcttttttcaacgcagtcttaggtatctggatctacttattggagatgctctg 3360 

3320 • 3340 * ~ 3360 



3361 gcgacttggtgccaccatctggcagcttttggccttcttcctagccttcttcctagacct 342 0 

3380 • 3400 3420 



34 21 catcctgctcattattgctctctatctacaacaaaactggtggactctattggttgatct 34 80 

3440 • 3460 • 3480 



34 81 cctttggctcctcctgtttctggcgattttaatctggatgtattaccatggacaacgaca 354 0 

3500 • 3520 • 3540 



3541 cagtgatgaacaccaccacgatgactccctcccgcaccctcaacaagctaccgatgattc 3 600 

3560 • 3580 • 3600 



3 601 tggccatgaatctgactctaactccaacgagggcagacaccacctgctcgtgagtggagc 3 660 

3620 • 3640 • 3660 



3661 cggcgacggacccccactctgctctcaaaacctaggcgcacctggaggtggtcctgacaa 3 72 0 

3680 • 3700 • 3720 






3 721 tggcccacaggaccctgacaacactgatgacaatggcccacaggaccctgacaacactga 37 8 0 

3740 • 3760 • 3780 



37 81 tgacaatggcccacatgacccgctgcctcaggaccctgacaacactgatgacaatggccc 384 0 

3800 • 3820 3840 



38 41 acaggaccctgacaacactgatgacaatggcccacatgacccgctgcctcatagccctag 3900 

3860 • 3880 • 3900 



39 01 cgactctgctggaaatgatggaggccctccacaattgacggaagaggttgaaaacaaagg 396 0 

3920 • 3940 " • 3960 



39 61 aggtgaccagggcccgcctttgatgacagacggaggcggcggtcatagtcatgattccgg 4 02 0 

3980 • 4000 ■ ~ 4020 



4 021 ccatggcggcggtgatccacaccttcctacgctgcttttgggttcttctggttccggtgg 4080 
• 4040 ■ 4060 • 4080 



4 081 agatgatgacgacccccacggcccagttcagctaagctactatgactaacctttctttac 414 0 

4100 ■ 4120 ■ 4140 



4141 ttctaggcattaccatgtcataggcttgcctgactgactctccctccatttactgggaat 42 0 0 

4160 • 4180 • 4200 



42 01 gccttagctaatcaccttaactggcacacactcccttagccacactgtctgtctaggctg 42 6 0 

4220 ■ 4240 • 4260 



42 61 aaaagccacattcatattctatttcaaaacaaggggaaaggaggacatgcgagaattggc 432 0 

4280 • 4300 • 4320 



4321 agacacctttacccagcccttaacacaccacacaggtagcaaggacccgggcgttgccag 438 0 

4340 ■ 4360 4380 



43 81 actccgccaccaacgcccctgcgttgaacccacccctcctacacacatcagacctctgca 444 0 

4400 • 4420 • 4440 



4441 caacacaactaccaggcagatgaggccccttacttccacagggtactggcataccagcgg 4500 

4460 ■ 4480 ■ 4500 






45 01 gggaccacatacatccctgtctcccacccagtaactccagcaactttgctttccatcttg 4560 

4520 • 4540 • 4560 



45 61 tgccaatacacatttggattcagcccaagccacacctaactcatgccagcagaggcagga 4 62 0 

4580 • 4600 • 4620 



4 621 acacctgttgttgacacattctttgcgcataagcactttaatccctctctcacacccaga 4 680 

4640 • 4660 • 4680 



4 681 aactaagagctagcccaaaacctccacacctgtcctcgctcatctttccacattcctctg 4 74 0 

4700 • 4720 • 4740 



4 741 gccttctttccttgtccttactgtataaaagtccacgaaaacagctgtgcctcactctcg 4 80 0 

4760 • 4780 • 4800 



4 801 agatggtacacgtcctggagcgtgctttgctagagcagcagtcctctgcctgcggcctgc 4860 

4 8 2 0 • 4 8 4 0 • 4 8 6 0 



4 861 ccggctcttctacggagaccaggcctagccacccctgccccgaggacccagacgtcagca 4 92 0 

4880 ■ 4900 • 4920 



4 921 gactaagactactcctggtggtactctgtgtcctgtttggacttttatgcctgctcctca 4980 

4940 ■ 4960 ■ 4980 



4 981 tctaagaagccaccatgcgaccgggtagaccactggctggattctacgctactctccgcc 5 04 0 

5000 ■ 5020 • 5040 

2 1 agaaggagc 2 9 

1 1 1 1 1 1 1 1 1 

5041 gttccttcagaagaatgtccaaaaggtcaaagaacaaggccaagaaggagcgtgtccccg 510 0 

5060 • 5080 -5100 



5101 tggaggaccgcccaccgactccgatgcccaccagccagcgactgatccgcagaaacgcgt 5160 

5120 • 5140 • 5160 



5161 tgggaggaggcgtccgccccgatgcggaggactgcatccaacgcttccaccccctggagc 52 2 0 

5180 • 5200 • 5220 



52 21 cagcgctgggggtgtcaacaaagaactttgacctgttgtccctgagatgtgaattgggat 52 8 0 

5240 • 5260 " • 5280 






52 81 ggtgtggataacatctcccgctagatggcgcccttattattgatgtgacttgtgatgcaa 53 4 0 

5300 • 5320 " - • 5340 



5 341 taaataaaagtacagatagatggcactcttaccttcctctgcccgcttcttcgtatatgt 54 0 0 

5360 • 5380 " • " 5400 



5 401 gttgagatgagtcatcccgtggagagtagggagggggagggagcccgtcattcccgtcgt 54 6 0 

5420 • 5440 ■ 5460 



54 61 gttgcaatcccaagtacagactttgatcttgggttcctagtggttgatagtccgagtgac 552 0 

5480 • 5500 • " 5520 



55 21 ggtcgccattgccccaatatgggtcctcataaggcggtgggggctcttcattagattcac 558 0 

5540 • 5560 • 5580 



55 81 gttcctcatcgttcggtggggtgggggtgttcccagaagagccagaagcagatggatatt 5 64 0 

5600 • 5620 • 5640 



5641 gggagttgtttccgccatcgtacccatccggatccccgccggggctagggggacccgcgc 570 0 

5660 • 5680 • 5700 



57 01 ccattggcaccatttctagggaccccatagctgcagcagcgactgcaaaaccggcctgaa 5760 

5720 5740 ■ 5760 



57 61 atgctctctgagaaacaaggcgagagggatttacgccccagcaagcttcaggcaaaataa 582 0 

5780 • 5800 • 5820 



5 821 cctgtgatgctgaaaccgcggcgctacaggatcaccccaattggcaagacctgggcggag 5880 
• ~ 5840 • 5860 • ~ 5880 



5 881 ttcctgagagcccaggggtctcgtgcaggtgtccccggggaattgttctccctgatcacc 594 0 

5900 • 5920 • 5940 



5941 gcccacccccgttttctccaaaccagcagctgtgacaatcttcacacactgctgctgtca 6 0 0 0 

5960 ~ ■ 5980 • 6000 



6001 cctggaactattttcccacggtgcccttccgcccattttcccacgagtcgcgaggctatc 6060 

6020 • 6040 • 6060 






6 061 cacccgcaatgccacccccccaatgccacactaaaacaaggtgaaataggcaagtgcgtt 612 0 

6080 • 6100 • 6120 



612 1 tattgcgacaagtatccagaaacataaaccccgtgggcttcctccttgtcatttttccca 618 0 . 

6140 • 6160 • 6180 

30 aacag 34 

Mill 

6181 acgcaggtcactggcaggtgccagggcttgggaagtgacaggtcaacagcaacagagagg 62 4 0 

6200 • 6220 • ~ 6240 



62 41 ctcccatccttttccttcataacaccgccatttgccgcagttggtgcgggctccacgccc 6 30 0 

6260 • 6280 • 6300 



6 301 tcgggcatgagccactggacgtggggatggggaaatgcattcacggtgcatgtcacagta 63 6 0 

6320 * 6340 . 6360 



6361 aggacagagaagtctgggaactgagacctttcggagtggacagacagcgttagaggcttc 64 2 0 

6380 • 6400 • 6420 



6421 accacgctcaggtgttcctgcttggtgacctcggtctcgcccagtttcatgcggcacagg 64 80 

6440 ~~ - 6460 ■ " 6480 



64 81 tagttgccgtcatgggagatgttggcagcggtgactactaaaaagaaggtgttggcactt 65 4 0 

6500 • 6520 « 6540 



6541 ctgtggatatcaaagaagcccctgaaaggccactctataaagatgacatcgtggtgcatg 66 00 

6560 • 6580 - 6600 



6601 cgcccaataagcacctgctcctctcctgggcccagtttaaaccagctgacctcaatctct 6 6 60 

6620 • " 6640 • 6660 



6 6 61 ggaccgaggctcaccctcctccagtaggaggtcagggtgactcgctcacccaagaaagcg 6 72 0 

6680 ' 6700 • 6720 

35 cctg 38 

1 1 M 

6 721 gtgacagcctggccggcggccacacaggaggccaacaggaggagctgagcgatgaacctg 67 80 

6740 6760 • 6780 



67 81 gccattgctctggactctcctcacccaggcctcgcgtcttatactattctgccacgccca 6 84 0 

6800 ■ 6820 • 6840 






6 841 ttttatcatataagcctgaagcccgtgagctggcctgacgagaccatgaggccagccaag 6 9 00 

6860 • 6880 • 6900 



6 901 tctacagattctgtgtttgtgaggaccccggtcgaggcgtgggtcgcgccctcgccgccg 696 0 

6 92 0 • 6 940 • 696 0 



6961 gacgacaaggtggctgagtccagctacctcatgttcagggccatgtacgcggtgttcacc 7 02 0 

6980 ■ 7000 • 7020 



7 021 cgggatgagaaagacctgcctttgccagccctggtcctctgccggctcatcaaggcctcc 7 080 

7 04 0 • 7060 ■ 7080 

40 

39 cggaggt 45 

ill li 

7 081 ctgaggaaggataggaagctgtacgcggagctggcctgcaggacagccgacatcgggggc 714 0 

7100 • 7120 • 7140 

60 • 80 

46 tcggccgccgcgggccaggacctcatcagcgtcccccgcaacacctttatgaca 99 

III | || | | | | | | || || || | xxxxxxxxxxxxxxxxxxxx 

7141 aaagacacgcacgtacggctcatcatcagcgtcctgcgcgcagtgtacaacgaccactac 72 0 0 

7160 ■ 7180 • 7200 



72 01 gactactggtcgcggctcagggtggtgctgtgctacacagtggtgtttgcggtgcgaaac 7 2 60 

7220 "* " " " . 7 24o " - 7260 



7261 tacctggatgaccacaagagcgccgccttcgtgctgggggcaatcgcccactacctggcc 7 32 0 

7280 • 7300 " • 7320 



7 321 ctctatcgcagactctggtttgcgaggctgggcggcatgccaagatcgctgagacgtcag 7380 

7340 • 7360 • 7380 



7 381 ttccccgtgacgtgggccctggccagcctgactgacttcctgaaatctttgtaaatgaat 7440 

7400 7420 • 7440 



7 441 aaacagtgggtgttgcgtgatgagtaaagtgtaacatttaatgtgggactgggaggccgg 750 0 

7460 • 7480 • 7500 



7 501 ggcgataccttgggcatcatgcagggtgcacagactagcgaggataatctgggcagccag 7 5 60 

7520 • 7540 • 7560 



7561 agccagccgggtccgtgcggctacatctacttttaccccctggccacctaccctcttagg 7620 

7580 • 7600 • 7620 






7 621 gaggtggccacactggggaccggctacgcgggccacaggtgcctgacggtgccgctcctt 7 6 80 

7640 • 7660 • * 7680 



7681 tgcggcatcaccgtggagccgggcttcagcatcaatgtcaaggctctgcacaggaggccc 7740 

7700 • 7720 • 7740 



7741 gaccccaactgcgggctcctacgcgctacctcctatcacagggacatctacgtgttccac 7 800 

7760 • 7780 • " 7800 



7 801 aatgcccatatggttccccccatctttgaggggccgggtctcgaggccctctgtggcgag 7 86 0 

7820 • 7840 • 7860 



7 861 accagggaggtgtttgggtacgacgcctacagcgccctaccgagggaaagctccaagccg 7 92 0 

7880 • 7900 • 7920 



7 921 ggggacttcttccccgaagggctagatccctctgcctacctgggggcggtggcaataacc 7 9 80 

7940 • 7960 ■ 7980 



7 981 gaggccttcaaggagcgactctacagcggaaacctggtggccattccatcgttaaaacag 8 04 0 
• " ~ 8000 " • 8020 • 8040 



8 041 gaggtagcggtggggcagtctgcgagcgttagggtcccgctctacgacaaggaggtgttc 8100 

8060 ■ 8080 ■ 8100 



8101 ccagagggcgtgccccacjctccgccagttttacaactcggacctcagccgctgcatgcac 816 0 

8120 ■ 8140 • 8160 



8161 gaggcgctgtacaccgggctggcgcaggcgctgcgcgtccgacgggtgggcaagctggtg 82 2 0 

8180 ■ 8200 • 8220 



82 21 gagctgctggagaagcagagcctgcaggaccaggccaaggtggccaaggtggcccccctc 82 8 0 

8240 • 8260 ■ 8280 



82 81 aaggagttcccagcctcaaccatcagtcacccggactcgggagccttaatgattgtggac 8 34 0 

8300 • 8320 • 8340 



8341 agcgcggcatgcgagctggcggtgagctacgcacccgccatgctggaggcctcgcacgag 8400 

8360 • 8380 • 8400 






84 01 accccggccagcctcaactacgactcgtggcccctgtttgccgactgtgagggtccagag 8460 

8420 ■ 8440 " • 8460 



84 61 gcccgtgtggctgcgttacaccgatataatgccagcctggccccccacgtgtccacgcag 852 0 

8480 ■ 8500 • 8520 



85 21 atctttgccaccaattccgtcctctacgtctcgggggtctcgaagtcaaccggtcagggc 85 80 

8540 • 8560 " • ~ 8580 



8581 aaggagagtctctttaacagtttctacatgacccacggcctggggaccctgcaggagggg 8 64 0 

8600 • 8620 • 8640 

100 

100 ctgcttc - 106 

I I I I I I I 

8 641 acctgggacccctgccgccgaccctgcttctcgggctggggtgggccagacgtgaccgga 8 7 00 

8660 • 8680 • 8700 

107 agacc 111 

I II 

87 01 accaacggtccgggaaactacgctgtggagcacctggtctatgcggcctccttctcgccc 87 60 

8720 ■ 8740 • 8760 

112 aacct 116 

I I I I I 

87 61 aaccttcttgcccgctatgcctactacctgcagttttgccagggacagaagagctctctg 8 82 0 

8780 8800 ~ • 8820 



8 821 accccggtgccggagacgggcagctacgtggcgggggcggccgccagtcccatgtgctcg 88 80 
• " 8840 • 8860 • 8880 



8881 ctctgcgagggccgggccccggccgtgtgcctgaacacgctcttctttaggctgagggac 89 4 0 

8900 • 8920 • 8940 



8941 cgcttcccccccgtcatgtccacgcagcggagggacccctatgtgatctcgggggcctcg 9 0 00 

8960 8980 • 9000 

117 g H7 

9 001 ggctcctacaacgagacggactttttgggcaactttctcaacttcatcgataaggaggac 90 60 

9020 • 9040 • 9060 

120 140 

118 gacaacaaaccgcc-gaggcaga-ccccgctac 148 

mi ii 1 1 1 in 1 1 1 1 1 1 1 1 1 1 1 

9061 gacgggcagcggccggacgacgagccccgctacacctactggcagctgaaccagaacctg 912 0 

9080 • 9100 • ^ 9120 



9121 ctggagcggctgtctcggctgggcatagacgctgaaggaaagctagagaaggagccccat 9180 

9140 • 9160 • 9180 






9181 ggcccgcgtgactttgtcaagatgttcaaggacgtggatgcggcggtggacgccgaagtg 92 4 0 

9200 • 9220 • 9240 



92 41 gtccagtttatgaacagcatggccaagaacaacatcacctacaaggacctggtcaagagc 930 0 

9260 • 9280 • " 9300 



9 301 tgctaccacgtgatgcagtactcgtgcaacccctttgcgcagcccgcctgccccatcttc 93 6 0 

9320 • 9340 9360 



9 361 acccagctgttttaccgctcactgctgaccatcctgcaggacatctccctgcccatctgt 94 2 0 

9380 • 9400 • 9420 



9 421 atgtgctatgagaatgacaaccccgggcttggccagagccccccagagtggctaaagggt 94 8 0 

9440 • 9460 ■ 9480 



9481 cactaccagacgctgtgcaccaactttaggagcctggccatcgacaagggggtcctcacg 954 0 

9500 • 9520 • 9540 



95 41 gccaaggaggccaaggtggtgcatggggagcccacctgcgacctgccagacctggacgcg 96 00 

9560 • 9580 • 9600 



9 601 gccctgcagggccgggtgtacggccggcggctgcctgtgcgcatgtccaaggtgctgatg 96 60 

9620 • 9640 • 9660 



9661 ctgtgccccaggaacatcaagatcaagaacagggtggtcttcacgggggagaatgccgcc 972 0 

9 6 8 0 . 9 7 0 0 • 9 7 2 0 



97 21 ctccagaacagcttcatcaagtccactaccaggagggagaactacatcatcaacgggccc 97 8 0 

9740 • 9760 • 9780 



97 81 tacatgaaattcctcaacacctaccacaagaccctattcccggacactaagctctcaagc 98 40 

9800 • 9820 • 9840 



9 841 ctgtacctgtggcacaacttttccaggcggcgctcggtccctgtccccagcggggccagc 99 0 0 

9860 * 9880 • 9900 



9901 gcggaggagtactctgacctggccctctttgtggacgggggctcccgggcccacgaagag 9960 

9920 • 9940 ■ 9960 






99 61 agcaacgtcatagatgtggtgcctggcaacctggtcacttacgccaagcagaggctcaac 1002 0 

9980 \ 10000 • 10020 



10 021 aacgccatcctgaaggcgtgcggccagacccagttctacatcagcctgattcagggactg 10080 

10040 • 10060 • 10080 



10081 gtgccgaggacgcagtcggtgcccgcccgtgactacccccacgtactgggcacgcgggcg 1014 0 

10100 ■ 10120 • 10140 

149 cctacgcgg 157 

MINIM 

10141 gtggagtcggcagcggcctacgcggaggccacctcctcccttactgcgaccacggtggtc 10200 

10160 * 10180 • 10200 

160 

158 ccccgctgccc 168 

I lllllllx 

102 01 tgcgcggccacagactgtcttagccaggtctgcaaggcccgtccggttgtcacgctgcca 10260 

10220 • 10240 • 10260 

180 

169 cccttttcccaccagg 184 

xxxxxxxxxxxxxxxx 

10261 gtgaccatcaacaagtacacgggggtcaacggcaacaaccagatattccaggccgggaac 10320 

10280 ■ 10300 • 10320 



10321 ctgggatactttatgggccggggcgtggacaggaacctgctgcaggcccccggggctggg 1038 0 

10340 " • .10360 • 10380 



10381 ctgcgcaagcaggccgggggctcttccatgcggaagaagtttgtctttgccacccccacc 10440 

10400 ■ 10420 ■ 10440 



10441 ctagggttgaccgtgaagcgccggacccaagccgcgaccacatatgagattgagaacatc 10500 

10460 • 10480 • 10500 



10501 agggctggcctggaggccattatatcacaaaaacaggaggaagactgtgtgtttgatgtg 1056 0 

10520 • 10540 • 10560 



105 61 gtgtgcaaccttgtggatgccatgggcgaggcatgcgcctcgctgactagggacgacgcg 10 620 
• 10580 • . 10600 • 10620 



10 621 gagtacttattgggccgcttctccgtcctggcggacagcgtcctagaaaccctggcgacc 10680 

10640 10660 * 10680 



10681 attgcctccagcgggatagagtggacggcggaggccgctcgggactttctggagggagtg 10740 

10700 • 10720 • 10740 






10741 tggggtgggcccggggcagcccaggacaactttatcagcgtggccgagccggtcagcacc 10800 

10760 • 10780 • 10800 



10 801 gcgtcgcaggcctcggccgggctgctgctgggtggaggagggcagggctccgggggcaga 10860 

10820 • 10840 • 10860 



10 861 cgcaagcgccgtctggccaccgttctccccggactcgaggtctagagacccctggggcgg 1092 0 

10880 • 10900 • 10920 



10 921 cgatgtcggggctgctggcggcggcgtacagccaggtgtacgccctggcggttgagctga 10980 

10940 • 10960 • 10980 



109 81 gcgtgtgcacccggctggacccccggagtctggacgtggctgcggtggtgcgcaacgccg 11040 

11000 • 11020 • 11040 



11041 gcctgctggccgagctggaggccatcctccttccccgtttgagacggcagaatgaccgtg 11100 

11060 ■ 11080 • 11100 



11101 catgcagcgccctgtccctggagctggtgcacctgctagagaactcgagagaggcctctg 1116 0 

11120 • 11140 ■ 11160 



11161 ccgcgctgctcgcccctggtagaaagggtacccgggtcccgcctctccgtaccccctcag 112 2 0 

11180 • 11200 • 11220 



112 21 tcgcgtactctgtggagttttacggggggcataaagtcgatgtaagtttgtgcctaataa 112 80 

11240 • 11260 " • 11280 

185 caatagc 191 

Mil l 

112 81 atgacatagagattttaatgaagagaatcaatagcgtgttttattgcatgtctcacacca 11340 

11300 • 11320 ■ 11340 



11341 tggggctggagagcctggaacgggccctggatctgctgggccgctttcggggcgtaagtc 11400

11360 • 11380 • 11400 



11401 ccatcccagacccgcgcctctacatcacctctgtgccctgctggcgctgtgtgggcgagc 11460

11420 • 11440 • 11460



11461 tgatggttctgcccaaccacggcaacccttccacggcagaggggacccacgtctcctgta 11520

11480 • 11500 • 11520 






11521 accacctggcggtgccggtgaatccggagccggtctcgggactgtttgagaatgaagtcc 11580

11540 • 11560 • 11580 



11581 gccaggcggggctcgggcacctgttggaggctgaggagaaggcgaggccgggcggcccag 11640

11600 11620 11640 
200 ■ 220 
192 --caccgcgccttcctacggtcctgg-ggccggag 223 

I 1 1 1 1 1 1 1 I I II MINIM 

11641 aggagggcgcggtcccgggcccggggcggccggaggcagagggggcgaccagagcgctgg 11700 

11660 • 11680 • 11700



11701 acacctacaacgtcttctcgacagtgcccccggaggtggcggagctctcggagctcctct 11760

11720 • 11740 • 11760 



11761 attggaactctggcggccatgctatcggtgcaacggggcagggggagggtggcggccatt 11820

11780 • 11800 • 11820 



11821 cccgcctctctgccctgtttgcccgggagcgtcgcctggccctggtgcggggggcctgcg 11880 

11840 • 11860 • 11880 



11881 aggaggcgctggcgggggcaaggctgactcacctgtttgacgccgtggctcccggggcca 11940 

11900 • 11920 • 11940

224 cgg-tc 228 

III II 

11941 cggagcggctcttctgcggcggggtctacagctcctcgggcgacgcggtggaggcgctga 12000 

11960 • 11980 • 12000



12001 aggcggactgcgccgccgccttcacggcgcacccccagtaccgggccatcctgcaaaaga 12060

12020 • 12040 • 12060 



12061 ggaacgagctgtacacgcggctcaaccgagccatgcagcggttgggccgaggcgaggagg 12120

12080 • 12100 • 12120

240 • 260 

229 gccccggccggcggctactttacctccccaggagg 263 

ii 

12121 aggcgtcccgggagagcccggaggtgccccggccggc ======================= 12157 

12140 

280 ■ 300 • 320 

264 ttactacgccgggcccgcgggcggggacccgggtgccttcttggcgatggacgctcacac 323



340 • 360 • 380

324 ctaccacccccacccacacccccctccggcctactttggcttgccgggcctctttggccc 383




400 • 420 ■ 440 

384 ccctccaccgtgcctccttactacggattcccacttgcgggcagactacgtccccgctcc 443



460 • 480 • 500 

444 ctcgcgatccaacaagcggaaaagagaccccgaggaggatgaagaaggcggggggctatt 503

Mill III III I 

12158 =================================tggggcacg=agagcccggcccgtccgg 12184

12160 • 12180 

520 • 540 ■ 560 

504 cccgggggaggacgccaccctctaccgcaaggacatagcgggcctctccaagagtgtgaa 563

I I Mill I III I llllllll 

12185 cgccctctcggacg===cgctcaagcgcaagga=========================== 12214

12200 

580 • 600 • 620 

564 tgagttacagcacacgctacaggccctgcgccgggagacgctgtcctacggccacaccgg 623

Ml llllllll II I 

12215 ==================gcagtacctgcgccaggtgg====================== 12234

12220 

640 • 660 • 680 

624 agtcggatactgcccccagcagggcccctgctacacccactcggggccttacggatttca 683



700 • 720 • 740 

684 gcctcatcaaagctacgaagtgcccagatacgtccctcatccgcccccaccaccaacttc 743



760 • 780 • 800 

744 tcaccaggcagctcaggcgcagcctccacccccgggcacacaggcccccgaagcccactg 803



820 • 840 • 860

804 tgtggccgagtccacgatccctgaggcgggagcagccgggaactctggaccccgggagga 863



880 • 900 • 920 

864 caccaaccctcagcagcccaccaccgagggccaccaccgcggaaagaaactggtgcaggc 923

i ii i ii 1 1 ii i 

12235 ====================ccaccgagggtct============================ 12247

12240 

940 • 960 ■ 980 

924 ctctgcgtccggagtggctcagtctaaggagcccaccacccccaaggccaagtctgtgtc 983

1 1 MM I III ' 

12248 =============================================ggccaagctgcagtc 12262

12260 

1000 • 1020 

984 a-gcccacctcaa--gtcc-atcttttgcgag-gaattgctgaataaacgcgtggcttga 1038

Ml I III I I I II I II II I II Ml II 

12263 ctgcctggcgcaacagagcgagaccctgaccgagaccctgtgcctgcgcgtctgggggga 12322
• 12280 • 12300 • 12320



12323 cgtggtctactgggagctggcccgcatgcgcaaccacttcctctacagacgggccttcgt 12382

12340 ■ 12360 • 12380 



12383 ctcgggtccctgggaggacaggcgcgccggcgagggtgccgcctttgagaattc 12436

• 12400 • 12420 



% Identity = 1.8 (239/13074) 




/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:33:28 AM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 2.xprt x Bankier et al. BA-LF3.xprt => Protein Alignment 

Protein sequence 177 aa MARRLPKPTLQG . . . DTAPRGARKKQ* 

Protein sequence 609 aa *GRRGVLIGPLL ... DRRAGEGAAFEN 



Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62



Alignment 11. Comparison of the amino acid sequence
encoded by the nucleotide sequence of SEQ ID NO:1
(SEQ ID NO:2) with the amino acid sequence, BA-LF3,
encoded by the nucleotide sequence of Fig. 2 of Bankier et al.

20 • 40 • 60

1 MARRLPKPTLQGRLEADFPDSPLLPKFQELNQNNLPNDVFREAQRSYLVFLTSQFCYEEY 60
RR R P +P Q + + Q y +

1 *GRRGVLIGPLLRPGGQRPRNPGDHCLQRDRVDGGGRSGLSGGSVGWARGSPGQL=YQRG 59

20 • 40

80 • 100 • 120 

61 VQRTFGVPRRQRAIDKRQRASVAGAGAHAHLGGSSATPVQQAQAAASAGTGALASSAPST 120

V RA + RA + G ST + AA+GAAS

60 RAGQHRVAGLGRAAAGWRRAGLRGQTQAPSGHRSPRTRGLETPGAAMSGLLAAAYSQVYA 119
60 • 80 100 

140 160 
121 -AVAQSATPSVSSSISSLRAATSGATAAASAAAAVDTGSGGGGQPHDTAPRGARKKQ*-- 177

AVS + +AAAA+ +A

120 LAVELSVCTRLDPRSLDVAAVVRNAGLLAELEAILLPRLRRQNDRACSALSLELVHLLEN 179
120 ■ 140 • 160 



180 SREASAALLAPGRKGTRVPPLRTPSVAYSVEFYGGHKVDVSLCLINDIEILMKRINSVFY 239
180 • 200 ■ 220 



240 CMSHTMGLESLERALDLLGRFRGVSPIPDPRLYITSVPCWRCVGELMVLPNHGNPSTAEG 299
240 * 260 • 280 



300 THVSCNHLAVPVNPEPVSGLFENEVRQAGLGHLLEAEEKARPGGPEEGAVPGPGRPEAEG 359
300 ■ 320 - 340 



360 ATRALDTYNVFSTVPPEVAELSELLYWNSGGHAIGATGQGEGGGHSRLSALFARERRLAL 419 
360 • 380 ■ 400 



420 VRGACEEALAGARLTHLFDAVAPGATERLFCGGVYSSSGDAVEALKADCAAAFTAHPQYR 479 
420 • 440 • 460 



480 AILQKRNELYTRLNRAMQRLGRGEEEASRESPEVPRPAGAREPGPSGALSDALKRKEQYL 539
480 • 500 • 520 



540 RQVATEGLAKLQSCLAQQSETLTETLCLRVWGDVVYWELARMRNHFLYRRAFVSGPWEDR 599
540 • 560 • 580 



600 RAGEGAAFEN 609
600

% Identity = 4.9 (30/610) % Homology = 2.0 (12/610) % Total = 6.9 (42/610)

///



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:33:56 AM (US Letter @ 100%)



9310-13DVCTDV SEQ ID NO 4.xprt x Bankier et al. BA-LF3.xprt => Protein Alignment 

Protein sequence 346 aa MLSGNAGEGATA . . . FCEELLNKRVA* 

Protein sequence 609 aa * GRRGVLIGPLL ... DRRAGEGAAFEN 

Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 12. Comparison of the amino acid sequence
encoded by the nucleotide sequence of SEQ ID NO:3
(SEQ ID NO:4) with the amino acid sequence, BA-LF3,
encoded by the nucleotide sequence of Fig. 2 of Bankier et al.

20 40 
1 MLSGNAGEGATAC-GGSAAAGQ-D--LIS--VPRNTFMTLLQTNLDNKP--PRQTPLPYA 52

G GG DLV L + + PQ 

1 *GRRGVLIGPLLRPGGQRPRNPGDHCLQRDRVDGGGRSGLSGGSVGWARGSPGQLYQRGR 60

20 • 40 60 

60 ■ 80 100 

53 APLPPFSHQAIATAPSYGPGAGAVAPAGGYFTSPGGYYAGPAGGDPGAFLAMDAHTYHPH 112 

A + A A G ASP G LA + 

61 AGQHRVAGLGRAAAGWRRAGLRGQTQAPSGHRSPRTRGLETPGAAMSGLLAAAYSQVYAL 120

80 ■ 100 • 120 

120 • 140 ■ 160 

113 PHPPPAYFGL-PG-LFGPPPPCLLTTDSHLRADYVPAPSRSNKRKRDPEEDEEGGGLFPG 170

LP L +LA+PRNR E L 

121 AVELSVCTRLDPRSLDVAAVVRNAGLLAELEAILLPRLRRQNDRACSALSLELVHLLENS 180 

140 • 160 • 180 

180 ' 200 • 220 

171 EDATLYRKDIAGLSKSVNELQHTLQALRRETLSYGHTGVGYCPQQGPCYTHSGPYGFQPH 230 

+A+ V L+ A E V C 

181 REASAALLAPGRKGTRVPPLRTPSVAYSVEFYGGHKVDVSLCLINDIEILMKRINSVFYC 240

200 • 220 • 240 

240 • 260 • 280

231 QSYEVPRYVPHPPPPPTSHQAAQAQPPPPG TQAPEAHCVAESTI-PEAGAAGNS-GP 285

S++ +PP TPCVE+PG +G 

241 MSHTMGLESLERALDLLGRFRGVSPIPDPRLYITSVPCWRCVGELMVLPNHGNPSTAEGT 300 

260 • 280 • 300 

300 • 320 • 340 

286 REDTNPQQ-PTTEGHHRGKKL--V-QASASGVAQSKEPTTPKAKSVSAHLKSIFCEELLN 341

N P G V QA + + ++E P A E 

301 HVSCNHLAVPVNPEPVSGLFENEVRQAGLGHLLEAEEKARPGGPEEGAVPGPGRPEAEGA 360 

320 ■ 340 • 360 

342 KRVA* 346

R 

361 TRALDTYNVFSTVPPEVAELSELLYWNSGGHAIGATGQGEGGGHSRLSALFARERRLALV 420 

380 ■ 400 • 420 

421 RGACEEALAGARLTHLFDAVAPGATERLFCGGVYSSSGDAVEALKADCAAAFTAHPQYRA 480 

440 • 460 • 480 

481 ILQKRNELYTRLNRAMQRLGRGEEEASRESPEVPRPAGAREPGPSGALSDALKRKEQYLR 540

500 • 520 • 540 

541 QVATEGLAKLQSCLAQQSETLTETLCLRVWGDVVYWELARMRNHFLYRRAFVSGPWEDRR 600 

560 - 580 • 600 






601 AGEGAAFEN 609

% Identity = 9.7 (59/609) % Homology = 3.0 (18/609) % Total = 12.6 (77/609) 
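
The percentages printed with each alignment follow from the bracketed counts. The sketch below is not part of the DNA Strider output; the function names `percent` and `summary` are hypothetical. It assumes that each percentage is the count divided by the stated denominator, rounded to one decimal place, and that % Total is computed from the sum of the identity and homology counts. Under those assumptions it reproduces the figures reported for Alignment 12:

```python
def percent(count: int, length: int) -> str:
    """Format count/length as a percentage with one decimal place."""
    return f"{100 * count / length:.1f}"


def summary(identical: int, similar: int, length: int) -> str:
    """Rebuild a statistics line from raw counts.

    identical: positions with the same residue in both sequences
    similar:   positions scored as homologous but not identical
    length:    denominator printed in the report (assumed here to be
               the length over which the alignment is scored)
    """
    total = identical + similar  # total is taken from counts, not from rounded percentages
    return (
        f"% Identity = {percent(identical, length)} ({identical}/{length}) "
        f"% Homology = {percent(similar, length)} ({similar}/{length}) "
        f"% Total = {percent(total, length)} ({total}/{length})"
    )


if __name__ == "__main__":
    # Counts reported for Alignment 12 (SEQ ID NO:4 versus BA-LF3).
    print(summary(59, 18, 609))
```

Because % Total is derived from the summed counts (77/609), it is reported as 12.6 even though the two rounded percentages, 9.7 and 3.0, would add to 12.7.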



/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:34:47 AM (US Letter @ 100%)



9310-13DVCTDV SEQ ID NO 5.xprt x Bankier et al. BA-LF3.xprt => Protein Alignment 

Protein sequence 24 aa AVDTGSGGGGQP . . . HDTAPRGARKKQ 

Protein sequence 609 aa * GRRGVLIGPLL ... DRRAGEGAAFEN 

Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 13. Comparison of the amino acid sequence of
SEQ ID NO:5 with the amino acid sequence, BA-LF3, encoded
by the nucleotide sequence of Fig. 2 of Bankier et al.



1 *GRRGVLIGPLLRPGGQRPRNPGDHCLQRDRVDGGGRSGLSGGSVGWARGSPGQLYQRGR 6 0 

20 -40 • 60 



61 AGQHRVAGLGRAAAGWRRAGLRGQTQAPSGHRSPRTRGLETPGAAMSGLLAAAYSQVYAL 12 0 

80 • 100 • . 120 



121 AVELSVCTRLDPRSLDVAAVVRNAGLLAELEAILLPRLRRQNDRACSALSLELVHLLENS 180 

140 • 160 • 180 



181 REASAALLAPGRKGTRVPPLRTPSVAYSVEFYGGHKVDVSLCLINDIEILMKRINSVFYC 2 40 

200 • 220 • 240 



241 MSHTMGLESLERALDLLGRFRGVSPIPDPRLYITSVPCWRCVGELMVLPNHGNPSTAEGT 300 

260 • 280 • 300 



301 HVSCNHLAVPVNPEPVSGLFENEVRQAGLGHLLEAEEKARPGGPEEGAVPGPGRPEAEGA 3 60 

320 • 340 • 360 



361 TRALDTYNVFSTVPPEVAELSELLYWNSGGHAIGATGQGEGGGHSRLSALFARERRLALV 420

380 • 400 • 420 



421 RGACEEALAGARLTHLFDAVAPGATERLFCGGVYSSSGDAVEALKADCAAAFTAHPQYRA 480 

440 • 460 • 480 



481 ILQKRNELYTRLNRAMQRLGRGEEEASRESPEVPRPAGAREPGPSGALSDALKRKEQYLR 540

500 • 520 • 540 

1 AVDTGSGGGGQPHDT 15

G 

541 QVATEGLAKLQSCLAQQSETLTETLCLRVWGDVVYWELARMRNHFLYRRAFVSGPWEDRR 600 

560 • 580 • 600 




20 

16 APRGARKKQ 24

A GA + 

601 AGEGAAFEN 609



% Identity = 0.7 (4/609) % Homology = 0.2 (1/609) % Total = 0.8 (5/609)

///



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:35:15 AM (US Letter @ 100%)



9310-13DVCTDV SEQ ID NO 6.xprt x Bankier et al. BA-LF3.xprt => Protein Alignment 

Protein sequence 30 aa STAVAQSATPSV . . LRAATSGATAAA 

Protein sequence 609 aa * GRRGVLIGPLL ... DRRAGEGAAFEN 

Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 14. Comparison of the amino acid sequence of
SEQ ID NO:6 with the amino acid sequence, BA-LF3, encoded
by the nucleotide sequence of Fig. 2 of Bankier et al.



1 *GRRGVLIGPLLRPGGQRPRNPGDHCLQRDRVDGGGRSGLSGGSVGWARGSPGQLYQRGR 6 0 

20 • 40 60 



61 AGQHRVAGLGRAAAGWRRAGLRGQTQAPSGHRSPRTRGLETPGAAMSGLLAAAYSQVYAL 12 0 

80 • 100 • 120 



121 AVELSVCTRLDPRSLDVAAVVRNAGLLAELEAILLPRLRRQNDRACSALSLELVHLLENS 180 

140 • 160 • 180 



181 REASAALLAPGRKGTRVPPLRTPSVAYSVEFYGGHKVDVSLCLINDIEILMKRINSVFYC 240

200 • 220 ■ 240 



241 MSHTMGLESLERALDLLGRFRGVSPIPDPRLYITSVPCWRCVGELMVLPNHGNPSTAEGT 3 00 

260 ■ 280 ■ 300 



301 HVSCNHLAVPVNPEPVSGLFENEVRQAGLGHLLEAEEKARPGGPEEGAVPGPGRPEAEGA 3 60 

320 • 340 ■ 360 



361 TRALDTYNVFSTVPPEVAELSELLYWNSGGHAIGATGQGEGGGHSRLSALFARERRLALV 420

380 • 400 • 420 



421 RGACEEALAGARLTHLFDAVAPGATERLFCGGVYSSSGDAVEALKADCAAAFTAHPQYRA 4 80 

440 « 460 ■ 480 



481 ILQKRNELYTRLNRAMQRLGRGEEEASRESPEVPRPAGAREPGPSGALSDALKRKEQYLR 54 0 

500 • 520 • 540 

20 

1 STAVAQSATPSVSSSISSLRA 21 

+ S R 

541 QVATEGLAKLQSCLAQQSETLTETLCLRVWGDVVYWELARMRNHFLYRRAFVSGPWEDRR 600 

560 ■ 580 • 600 




22 ATSGATAAA 30

A GA 

601 AGEGAAFEN 609 



% Identity = 0.8 (5/609) % Homology = 0.2 (1/609) % Total = 1.0 (6/609)

///




### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:27:42 AM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 2.xprt x Bankier et al. BA-LF2.xprt => Protein Alignment 

Protein sequence 177 aa MARRLPKPTLQG . . . DTAPRGARKKQ* 

Protein sequence 1129 aa MQGAQTSEDNLG . . . RLATVLPGLEV* 



Method: Diagonals (BLOSUM62) 

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 



Alignment 15. Comparison of the amino acid sequence 
encoded by the nucleotide sequence of SEQ ID NO: 1 
(SEQ ID NO:2) with the amino acid sequence, BA-LF2, 
encoded by the nucleotide sequence of Fig. 2 of Bankier et al.



1 MQGAQTSEDNLGSQSQPGPCGYI YFYPLATYPLREVATLGTGYAGHRCLTVPLLCGITVE 6 0 

20 40 - 60 



61 PGFS INVKALHRRPDPNCGLLRATS YHRDIYVFHNAHMVPPIFEGPGLEALCGETREVFG 12 0 

80 • 100 • 120 



121 YDAYSALPRESSKPGDFFPEGLDPSAYLGAVAITEAFKERLYSGNLVAIPSLKQEVAVGQ 180 

140 • 160 ■ 180 



181 SAS VRVPLYDKEVFPEGVPQLRQFYNSDLSRCMHEALYTGLAQALRVRRVGKLVELLEKQ 2 4 0 

200 • 220 • 240 



24 1 SLQDQAKVAKVAPLKEFPASTISHPDSGALMIVDSAACELAVSYAPAMLEASHETPASLN 300 

260 • 280 • 300 



301 YDSWPLFADCEGPE ARVAALHRYNASLAPHVSTQIFATNSVLYVSGVSKSTGQGKESLFN 360 

320 ■ 340 • 360 



361 SFYMTHGLGTLQEGTWDPCRRPCFSGWGGPDVTGTNGPGNYAVEHLVYAASFSPNLLARY 42 0 

380 • 400 • 420 



421 AYYLQFCQGQKSSLTPVPETGSYVAGAAASPMCSLCEGRAPAVCLNTLFFRLRDRFPPVM 480 

440 ■ 460 • 480 



481 STQRRDPYVISGASGS YNETDFLGNFLNFIDKEDDGQRPDDEPRYTYWQLNQNLLERLSR 54 0 

500 ■ 520 • 540 



54 1 LGIDAEGKLEKEPHGPRDFVKMFKDVDAAVDAE VVQFMNSMAKNNITYKDLVKSC YHVMQ 600 

560 ■ 580 • 600 






601 YSCNPFAQPACPIFTQLFYRSLLTILQDISLPICMCYENDNPGLGQSPPEWLKGHYQTLC 660 

620 • 640 • 660 



6 61 TNFRSLAIDKGVLTAKEAKVVHGEPTCDLPDLDAALQGRVYGRRLPVRMSKVLMLCPRNI 7 2 0 

680 • 700 • 720 



721 KIKNRVVFTGENAALQNSFIKSTTRRENYI INGPYMKFLNTYHKTLFPDTKLSSLYLWHN 78 0 

740 • 760 • 780 



781 FSRRRSVPVPSGASAEEYSDLALFVDGGSRAHEESNVIDVVPGNLVTYAKQRLNNAILKA 840 

800 • 820 • 840 



841 CGQTQFYISLIQGLVPRTQSVPARDYPHVLGTRAVESAAAYAEATSSLTATTVVCAATDC 900 

860 • 880 • 900 

1 MARRLPKP 8 

xxxxxxxx 

9 01 LSQVCKARP VVTLPVTINKYTGVNGNNQIFQAGNLGYFMGRGVDRNLLQAPGAGLRKQAG 9 6 0 

920 • 940 • 960 

20 ■ 40 60 

9 TLQGRLEADFPDSPLLPKFQELNQNNLPNDVFREAQRSYLVFLTSQFCYEEYVQRTFGVP 68 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
961 GSSMRKKFVFATPTLGLTVKRRTQAATTYEIENIRAGLEAI ISQKQEEDCVFDVVCNLVD 102 0 

980 • 1000 • 1020 

80 • 100 • 120 

6 9 RRQRAIDKRQRASVAGAGAHAHLGGSSATPVQQAQAAASAGTGALASS APSTAVAQSATP 12 8 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
1021 AMGEACASLTRDDAEYLLGRFSVLADSVLETLATIASSGIEWTAEAARDFLEGVWGGPGA 108 0 

1040 • 1060 ■ 1080 

140 • 160 

12 9 S VSSS ISSLRAATSGATAAASAAAAVDTGSGGGGQPHDTAPRGARKKQ* 177 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
10 81 AQDNFISVAEPVSTASQASAGLLLGGGGQGSGGRRKRRLATVLPGLE V* 112 9 

1100 • 1120 

% Identity = 0.0 (0/1129) % Homology = 0.0 (0/1129) % Total = 0.0 (0/1129) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:29:06 AM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 4.xprt x Bankier et al. BA-LF2.xprt => Protein Alignment 

Protein sequence 346 aa MLSGNAGEGATA . . . FCEELLNKRVA*

Protein sequence 1129 aa MQGAQTSEDNLG . . . RLATVLPGLEV* 

Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 16. Comparison of the amino acid sequence
encoded by the nucleotide sequence of SEQ ID NO:3
(SEQ ID NO:4) with the amino acid sequence, BA-LF2,
encoded by the nucleotide sequence of Fig. 2 of Bankier et al.



1 MQGAQTSEDNLGSQSQPGPCGYIYFYPLATYPLREVATLGTGYAGHRCLTVPLLCGITVE 6 0 

20 • 40 60 



61 PGFSINVKALHRRPDPNCGLLRATSYHRDIYVFHNAHMVPPIFEGPGLEALCGETREVFG 120 

80 • 100 • 120 



121 YDAYSALPRESSKPGDFFPEGLDPSAYLGAVAITEAFKERLYSGNLVAIPSLKQEVAVGQ 180 

140 • 160 ■ 180 



181 SASVRVPLYDKE VFPEGVPQLRQFYNSDLSRCMHEALYTGLAQALRVRRVGKLVELLEKQ 2 4 0 

200 • 220 ■ 240 



241 SLQDQAKVAKVAPLKEFPASTISHPDSGALMIVDSAACELAVSYAPAMLEASHETPASLN 300 

260 • 280 • 300 



301 YDSWPLFADCEGPEARVAALHRYNASLAPHVSTQIFATNSVLYVSGVSKSTGQGKESLFN 360 

320 ■ 340 • 360 



361 SFYMTHGLGTLQEGTWDPCRRPCFSGWGGPDVTGTNGPGNYAVEHLVYAASFSPNLLARY 42 0 

380 • 400 • 420 



4 21 AYYLQFCQGQKSSLTPVPETGSYVAGAAASPMCSLCEGRAPAVCLNTLFFRLRDRFPPVM 4 8 0 

440 • 460 • 480 



481 STQRRDPYVISGASGSYNETDFLGNFLNFIDKEDDGQRPDDEPRYTYWQLNQNLLERLSR 54 0 

500 * 520 • 540 



54 1 LGIDAEGKLEKEPHGPRDFVKMFKDVDAAVDAEVVQFMNSMAKNNITYKDLVKSCYHVMQ 600 

560 • 580 • 600 






601 YSCNPFAQPACPIFTQLFYRSLLTILQDISLPICMCYENDNPGLGQSPPEWLKGHYQTLC 660 

620 • 640 • 660 



661 TNFRSLAIDKGVLTAKEAKVVHGEPTCDLPDLDAALQGRVYGRRLPVRMSKVLMLCPRNI 72 0 

680 • 700 ■ 720 



72 1 KIKNRVVFTGENAALQNSFIKSTTRRENYIINGPYMKFLNTYHKTLFPDTKLSSLYLWHN 780 

740 ■ 760 ■ 780 

20 • 40 

1 MLSGNAGEGATACGGSAAAGQDLISVPRNTFMTLLQTNLDNKPPRQTPLPYAAPLPP 57 

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
7 81 FSRRRSVPVPSGASAEEYSDLALFVDGGSRAHEESNVIDVVPGNLVTYAKQRLNNAILKA 84 0 

800 ■ 820 • 840 

60 • 80 100 

5 8 FSHQAIATAPSYGPGAGAVAPAGGYFTSPGGYYAGPAGGDPGAFLAMDAHTYHPHPHPPP 117 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
841 CGQTQFYISLIQGLVPRTQSVPARDYPHVLGTRAVESAAAYAEATSSLTATTVVCAATDC 9 00 

860 • 880 • 900 

120 • 140 • 160 

118 AYFGLPGLFGPPPPCLLTTDSHLRADYVPAPSRSNKRKRDPEEDEEGGGLFPGEDATLYR 17 7 

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
901 LSQVCKARPVVTLPVTINKYTGVNGNNQIFQAGNLGYFMGRGVDRNLLQAPGAGLRKQAG 9 60 

920 ■ 940 • 960 

180 • 200 ■ 220 

178 KDIAGLSKSVNELQHTLQALRRETLSYGHTGVGYCPQQGPCYTHSGPYGFQPHQSYEVPR 2 37 

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
961 GSSMRKKFVFATPTLGLTVKRRTQAATTYEIENIRAGLEAI ISQKQEEDCVFDVVCNLVD 102 0 

980 ' 1000 ■ 1020 

240 • 260 • 280 

2 38 YVPHPPPPPTSHQAAQAQPPPPGTQAPEAHCVAESTIPEAGAAGNSGPREDTNPQQPTTE 2 97 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
1021 AMGEACASLTRDDAEYLLGRFSVLADS VLETLATIASSGIEWTAEAARDFLEGVWGGPGA 1080 

1040 • 1060 ■ 1080 

300 • 320 • 340 

298 GHHRGKKLVQASASGVAQSKEPTTPKAKS VSAHLKS IFCEELLNKRVA* 346 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
1081 AQDNFI S VAEPVSTASQASAGLLLGGGGQGSGGRRKRRLATVLPGLE V* 112 9 

1100 • 1120 

% Identity = 0.0 (0/1129) % Homology = 0.0 (0/1129) % Total = 0.0 (0/1129) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:31:03 AM (US Letter @ 100%)



9310-13DVCTDV SEQ ID NO 5.xprt x Bankier et al. BA-LF2.xprt => Protein Alignment

Protein sequence 24 aa AVDTGSGGGGQP . . . HDTAPRGARKKQ 

Protein sequence 1129 aa MQGAQTSEDNLG . . . RLATVLPGLEV* 

Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 17. Comparison of the amino acid sequence of
SEQ ID NO:5 with the amino acid sequence, BA-LF2, encoded
by the nucleotide sequence of Fig. 2 of Bankier et al.



1 MQGAQTSEDNLGSQSQPGPCGYI YFYPLATYPLREVATLGTGYAGHRCLTVPLLCGITVE 6 0 

20 -40 • 60 



61 PGFSINVKALHRRPDPNCGLLRATSYHRDIYVFHNAHMVPPIFEGPGLEALCGETREVFG 12 0 

80 • 100 • 120 



121 YDAYSALPRESSKPGDFFPEGLDPSAYLGAVAITEAFKERLYSGNLVAIPSLKQEVAVGQ 180 

140 • 160 • 180 



181 SAS VRVPLYDKEVFPEGVPQLRQFYNS DLSRCMHEALYTGLAQALRVRRVGKLVELLEKQ 2 4 0 

200 • 220 • 240 



241 SLQDQAKVAKVAPLKEFPASTISHPDSGALMIVDSAACELAVS YAPAMLE ASHETPASLN 300 

260 • 280 • 300 



301 YDSWPLFADCEGPEARVAALHRYNASLAPHVSTQIFATNSVLYVSGVSKSTGQGKESLFN 360 

320 - 340 ■ 360 



361 SFYMTHGLGTLQEGTWDPCRRPCFSGWGGPDVTGTNGPGNYAVEHLVYAASFSPNLLARY 42 0 

380 • 400 • 420 



4 21 AYYLQFCQGQKSSLTPVPETGS YVAGAAASPMCSLCEGRAPAVCLNTLFFRLRDRFPPVM 4 8 0 

440 • 460 • 480 



481 STQRRDPYVISGASGSYNETDFLGNFLNFIDKEDDGQRPDDEPRYTYWQLNQNLLERLSR 540 

500 • 520 • 540 



541 LGIDAEGKLEKEPHGPRDFVKMFKDVDAAVDAE VVQFMNSMAKNN ITYKDLVKSCYHVMQ 600 

560 • 580 ■ 600 






601 YSCNPFAQPACPIFTQLFYRSLLTILQDISLPICMCYENDNPGLGQSPPEWLKGHYQTLC 660 

620 • 640 • 660 



661 TNFRSLAIDKGVLTAKEAKVVHGEPTCDLPDLDAALQGRVYGRRLPVRMSKVLMLCPRNI 72 0 

680 • 700 • 720 



721 KIKNRVVFTGENAALQNSFIKSTTRRENYI INGPYMKFLNTYHKTLFPDTKLSSLYLWHN 780 

740 • 760 • 780 



781 FSRRRSVPVPSGASAEEYSDLALFVDGGSRAHEESNVIDVVPGNLVTYAKQRLNNAILKA 84 0 

800 • 820 • 840 



841 CGQTQFYISLIQGLVPRTQSVPARDYPHVLGTRAVESAAAYAEATSSLTATTVVCAATDC 900 

860 • 880 • 900 



901 LSQVCKARPVVTLPVTINKYTGVNGNNQIFQAGNLGYFMGRGVDRNLLQAPGAGLRKQAG 96 0 

920 • 940 • 960 



961 GSSMRKKFVFATPTLGLTVKRRTQAATTYE IENIRAGLEAIISQKQEEDCVFDVVCNLVD 102 0 

980 • 1000 • 1020 



1021 AMGEACASLTRDDAEYLLGRFSVLADSVLETLATIASSGIEWTAEAARDFLEGVWGGPGA 1080 

1040 • 1060 • 1080 

20 

1 AVDTGSGGGGQPHDTAPRGARKKQ 2 4 

xxxxxxxxxxxxxxxxxxxxxxxx 
10 81 AQDNF I S VAEPVSTASQAS AGLLLGGGGQGSGGRRKRRLATVLPGLE V* 112 9 

1100 • 1120 

% Identity = 0.0 (0/1129) % Homology = 0.0 (0/1129) % Total = 0.0 (0/1129) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:31:24 AM (US Letter @ 100%)



9310-13DVCTDV SEQ ID NO 6.xprt x Bankier et al. BA-LF2.xprt => Protein Alignment 

Protein sequence 30 aa STAVAQSATPSV . . . LRAATSGATAAA 

Protein sequence 1129 aa MQGAQTSEDNLG . . . RLATVLPGLEV* 

Alignment 18. Comparison of the amino acid sequence of
SEQ ID NO:6 with the amino acid sequence, BA-LF2, encoded
by the nucleotide sequence of Fig. 2 of Bankier et al.



1 MQGAQTSEDNLGSQSQPGPCGYIYFYPLATYPLREVATLGTGYAGHRCLTVPLLCGITVE 6 0 

20 • 40 60 



61 PGFS INVKALHRRPDPNCGLLRATS YHRDIYVFHNAHMVPPIFEGPGLEALCGETREVFG 12 0 

80 • 100 • 120 



121 YDAYSALPRESSKPGDFFPEGLDPSAYLGAVAITEAFKERLYSGNLVAIPSLKQEVAVGQ 180 

140 • 160 • 180 



181 SASVRVPLYDKEVFPEGVPQLRQFYNSDLSRCMHEALYTGLAQALRVRRVGKLVELLEKQ 24 0 

200 • 220 • 240 



241 SLQDQAKVAKVAPLKEFPASTISHPDSGALMIVDSAACELAVSYAPAMLEASHETPASLN 30 0 

260 ■ 280 • 300 



301 YDSWPLFADCEGPEARVAALHRYNASLAPHVSTQIFATNSVLYVSGVSKSTGQGKESLFN 360 

320 ' 340 • 360 



361 SFYMTHGLGTLQEGTWDPCRRPCFSGWGGPDVTGTNGPGNYAVEHLVYAASFSPNLLARY 42 0 

380 • 400 • 420 



4 21 AYYLQFCQGQKSSLTPVPETGSYVAGAAASPMCSLCEGRAPAVCLNTLFFRLRDRFPPVM 4 8 0 

440 • 460 • 480 



481 STQRRDPYVTSGASGS YNETDFLGNFLNFIDKEDDGQRPDDEPRYTYWQLNQNLLERLSR 54 0 

500 • 520 ■ 540 



Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62



541 LGIDAEGKLEKEPHGPRDFVKMFKDVDAAVDAE VVQFMNSMAKNNITYKDLVKSCYHVMQ 600 

560 • 580 - 600 






601 YSCNPFAQPACPIFTQLFYRSLLTILQDISLPICMCYENDNPGLGQSPPEWLKGHYQTLC 6 60 

620 • 640 • 660 



661 TNFRSLAIDKGVLTAKEAKVVHGEPTCDLPDLDAALQGRVYGRRLPVRMSKVLMLCPRNI 72 0 

680 • 700 • 720 



721 KIKNRVVFTGENAALQNSFIKSTTRRENYIINGPYMKFLNTYHKTLFPDTKLSSLYLWHN 780 

740 ■ 760 • 780 



781 FSRRRSVPVPSGASAEEYSDLALFVDGGSRAHEESNVID VVPGNLVTYAKQRLNNAILKA 8 40 

800 • 820 • 840 



8 41 CGQTQFYISLIQGLVPRTQS VPARDYPHVLGTRAVES AAAYAEATSSLTATTVVCAATDC 9 0 0 

860 ■ 880 • 900 



9 01 LSQVCKARP VVTLPVTINKYTGVNGNNQIFQAGNLGYFMGRGVDRNLLQAPGAGLRKQAG 9 6 0 

920 • 940 • 960 



961 GSSMRKKFVFATPTLGLTVKRRTQAATTYEIENIRAGLEAIISQKQEEDCVFDVVCNLVD 102 0 

980 * 1000 • 1020 



1021 AMGEACASLTRDDAEYLLGRFSVLADSVLETLATIASSGIEWTAEAARDFLEGVWGGPGA 1080 

1040 • 1060 • 1080 

20 

1 STAVAQSATPS VS SS ISSLRAATSGATAAA 3 0 

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
10 81 AQDNFIS VAEPVSTASQASAGLLLGGGGQGSGGRRKRRLATVLPGLEV* 112 9 

1100 ■ .1120 

% Identity = 0.0 (0/1129) % Homology = 0.0 (0/1129) % Total = 0.0 (0/1129) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:21:19 AM (US Letter @ 100%)



9310-13DVCTDV SEQ ID NO 2.xprt x Bankier et al. BA-LF1.xprt => Protein Alignment

Protein sequence 177 aa MARRLPKPTLQG . . . DTAPRGARKKQ*

Protein sequence 221 aa MNLAIALDSPHP . . . LASLTDFLKSL*

Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 19. Comparison of the amino acid sequence
encoded by the nucleotide sequence of SEQ ID NO:1
(SEQ ID NO:2) with the amino acid sequence, BA-LF1,
encoded by the nucleotide sequence of Fig. 2 of Bankier et al.

20 • 40-60 

1 MARRLPKPTLQGRLEADFPDSPLLPKFQELNQNNLPNDVFREAQRSYLVFLTSQFCYEEY 60

M + + LA+P L+ P + + R A+ + VF+ + E + 

1 MNLAIALDSPHPGL=ASYTILPRPFYHISLKPVSWPDETMRPAKSTDSVFVRTPV==EAW 57 

20 -40 
80 • 100 

61 VQRTFGVPRRQRAIDKRQRASVAGAGAHAHLGGSSATPVQQAQAAASAGTGAL-ASSAPS 119 

V++ + RAA A++A+ LA A 

58 VAPSPPDDKVAESSYLMFRAMYAVFTRDEKDLPLPALVLCRLIKASLRKDRKLYAELACR 117
60 • 80 100 

120 • 140 • 160 

120 TAVAQSATPSVSSSISSLRAATSGATAAASAAAAVD--TGSGGGGQPHDTAPRGARKKQ* 177

TA V IS LRA + S V T DA

118 TADIGGKDTHVRLIISVLRAVYNDHYDYWSRLRVVLCYTVVFAVRNYLDDHKSAAFVLGA 177
120 ■ 140 • 160 



178 IAHYLALYRRLWFARLGGMPRSLRRQFPVTWALASLTDFLKSL* 221
180 • 200 • 220 

% Identity = 14.7 (33/224) % Homology = 8.0 (18/224) % Total = 22.8 (51/224) 



/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:22:01 AM (US Letter @ 100%) 

9310-13DVCTDV SEQ ID NO 4.xprt x Bankier et al. BA-LF1.xprt => Protein Alignment

Protein sequence 346 aa MLSGNAGEGATA . . . FCEELLNKRVA* 

Protein sequence 221 aa MNLAIALDSPHP . . . LASLTDFLKSL* 

Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 20. Comparison of the amino acid sequence
encoded by the nucleotide sequence of SEQ ID NO:3
(SEQ ID NO:4) with the amino acid sequence, BA-LF1,
encoded by the nucleotide sequence of Fig. 2 of Bankier et al.

20 «40 • 60 

1 MLSGNAGEGATACGGSAAAGQDLISVPRNTFMTLLQTNLDNKPPRQTPLPYAAPLPPFSH 6 0 

M A+ GA+ + + + +P + T+ P + 

1 MNLAIALDSPHP=GL=ASYTILPRPFYHISLKPVSWPDETMRPAKSTDSVFVRT=PVEAW 5 7 

20 ■ 40 

80 • 100 • 120 

61 QAI ATAPS YGPGAGAVAPAGGYFTSPGGYYAGPAGGDPGAFLAMDAHTYHPHPHPPPAYF 12 0 

A + + + Y P L+ +A 

5 8 VAPSPPDDKVAESS YLMFRAMYAVFTRDEKDLPLPALVLCRLIKASLRKDRKLYAELACR 117 
60 • 80 100 

140 • 160 ■ 180 

121 GLPGLFGPPPPCLLTTDSHLRADYVPAPSRSNKRKRDPEEDEEGGGLFPGEDATLYRKDI 18 0 

G L S LRA Y + R R + D + 

118 TADIG=GKDTHVRLI I = S VLRAVYNDHYDYWS =RLRVVLCYTVVFAVRNYLDDHKSAAFV 174 
120 • 140 • 160 

200 • 220 • 240 

181 AGLSKSVNELQHTLQALRRETLS YGHTGVGYCPQQGPCYTHSGPYGFQPHQSYEVPRYVP 2 4 0 

G L L R + T 

175 LGAIAHYLALYRRLWFARLGGMPRSLRRQFPVTWALASLTDFLKSL* = ======= = = = ==== 22 1 

180 • 200 ■ 220 

260 • 280 • 300 

24 1 HPPPPPTSHQAAQAQPPPPGTQAPEAHCVAESTIPEAGAAGNSGPREDTNPQQPTTEGHH 300 

===================r=s===s===================:r===========sss 

320 ■ 340 

301 RGKKLVQASASGVAQSKEPTTPKAKSVSAHLKS I FCEELLNKRVA* 346 



% Identity = 7.8 (27/346) % Homology = 5.2 (18/346) % Total = 13.0 (45/346)

///



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:23:10 AM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 5.xprt x Bankier et al. BA-LF1.xprt => Protein Alignment

Protein sequence 24 aa AVDTGSGGGGQP . . . HDTAPRGARKKQ 

Protein sequence 221 aa MNLAIALDSPHP . . . LASLTDFLKSL* 

Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 21. Comparison of the amino acid sequence of
SEQ ID NO:5 with the amino acid sequence, BA-LF1, encoded
by the nucleotide sequence of Fig. 2 of Bankier et al.

20 

1 AVDTGSGGGGQPHDTA-PRGARKKQ 2 4 

A+ S G T PR 

1 MNLAIALDSPHPGLASYTILPRPFYHISLKPVSWPDETMRPAKSTDSVFVRTPVEAWVAP 6 0 

20 • 40 60 



61 SPPDDKVAESSYLMFRAMYAVFTRDEKDLPLPALVLCRLIKASLRKDRKLYAELACRTAD 12 0 

80 • 100 • 120 



121 IGGKDTHVRLIISVLRAVYNDHYDYWSRLRVVLCYTVVFAVRNYLDDHKSAAFVLGAIAH 180 

140 • 160 • 180 



181 YLALYRRLWFARLGGMPRSLRRQFPVTWALASLTDFLKSL* 2 21 

200 ■ 220 

% Identity = 2.7 (6/221) % Homology = 0.5 (1/221) % Total = 3.2 (7/221) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:23:29 AM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 6.xprt x Bankier et al. BA-LF1.xprt => Protein Alignment

Protein sequence 30 aa STAVAQSATPSV ... LRAATSGATAAA 

Protein sequence 221 aa MNLAIALDSPHP . . . LASLTDFLKSL* 

Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 22. Comparison of the amino acid sequence of
SEQ ID NO:6 with the amino acid sequence, BA-LF1, encoded
by the nucleotide sequence of Fig. 2 of Bankier et al.

20 

1 ST-AVAQ-SATPSVSSS ISSLRAATSGATAAA 30 

A+A S P + + S R + 

1 MNLAIALDSPHPGLAS YTILPRPFYHISLKPVSWPDETMRPAKSTDSVFVRTPVEAWVAP 6 0 

20 • 40 -60 



61 SPPDDKVAESS YLMFRAMYAVFTRDEKDLPLPALVLCRLIKASLRKDRKLYAELACRTAD 12 0 

80 • 100 ■ 120 



121 IGGKDTHVRLIISVLRAVYNDHYDYWSRLRVVLCYTVVFAVRNYLDDHKS AAFVLGAIAH 180 

140 • 160 • 180 



181 YLALYRRLWFARLGGMPRSLRRQFPVTWALASLTDFLKSL* 2 21 

200 • 220 

% Identity = 2.7 (6/221) % Homology = 1.8 (4/221) % Total = 4.5 (10/221) 

/// 




### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:36:21 AM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 2.xprt x Bankier et al. BA-RF1.xprt => Protein Alignment

Protein sequence 177 aa MARRLPKPTLQG . . . DTAPRGARKKQ* 

Protein sequence 222 aa MARFIAQLLLLA . . . HGVYVSGYLSQ* 



Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 23. Comparison of the amino acid sequence
encoded by the nucleotide sequence of SEQ ID NO:1
(SEQ ID NO:2) with the amino acid sequence, Ba-RF1,
encoded by the nucleotide sequence of Fig. 2 of Bankier et al.



20 40 
1 MARRLPKPTLQGRLEADFPDSPL-LPKFQELNQNNLPNDVFREAQRSYLVFLTSQFCYEE 59 

MAR + + L A L+L +E + S+ + 

1 MARFIAQLLLLASCVAAGQAVTAFLGERVTLTSYWRRVSLGPE IEVSWFKLGPGEEQVLI 60 

20 • 40 60 

60 • 80 100 

6 0 YVQRTFGVPRRQRAIDKRQRASVAGAGAHAHLGGSSATPVQQAQAAASAGTGALASSAPS 119 

+ A + + T S 

61 GRMHHDVIFIEWPFRGFFDIHRSANTFFLVVTAANISHDGNYLCRMKLGETEVTKQEHLS 12 0 

80 ■ 100 • 120 

120 • 140 • 160 

12 0 TAVAQSATP-SVSSS IS SLRAATSGATAAAS AAAAVDTGSGGGGQPH DTAPRGARKKQ* - 177 

+ +SS TTA V G+PTAGK+ 

121 VVKPLTLSVHSERSQFPDFSVLTVTCTVNAFPHPHVQWLMPEGVEPAPTAANGGVMKEKD 180 

140 • 160 • 180 



181 GSLS VAVDLSLPKPWHLPVTCVGKNDKEEAHGVYVSGYLSQ* 2 2 2 

200 • 220 

% Identity = 10.8 (24/222) % Homology = 6.3 (14/222) % Total = 17.1 (38/222) 



/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 11:37:58 AM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 4.xprt x Bankier et al. BA-RF1.xprt => Protein Alignment

Protein sequence 346 aa MLSGNAGEGATA . . . FCEELLNKRVA*

Protein sequence 222 aa MARFIAQLLLLA . . . HGVYVSGYLSQ* 



Method: Diagonals (BLOSUM62)

Layout: Standard

Block Length <: 6-aa

Mismatch penalty: Smaller (1)

Gap penalty: Medium (2)

Display: BLOSUM62

Alignment 24. Comparison of the amino acid sequence
encoded by the nucleotide sequence of SEQ ID NO:3
(SEQ ID NO:4) with the amino acid sequence, Ba-RF1,
encoded by the nucleotide sequence of Fig. 2 of Bankier et al.



20 • 40 60 

1 MLSGNAGEGATACGGSAAAGQDLISVPRNTFMTLLQTNLDNKPPRQTPLPYAAPLPPFSH 6 0 



80 • 100 • 120 

61 QAIATAPSYGPGAGAVAPAGGYFTSPGGYYAGPAGGDPGAFLAMDAHTYHPHPHPPPAYF 12 0 

A F 

1 = = = === = ======= = = = ==== = ======:==== = ===!==== = ===============:=======:=3=MARF 4 

140 • 160 • 180 

121 GLPGLFGPPPPCLLTTDSHLRADYVPAPSRSNKRKRDPEEDEEGGGLFPGEDATLYRKDI 180 

L ++VS+PE+ L PGE+ L + 

.5 IAQLLLLASCVAAGQAVTAFLGERVTLTSYWRRVSLGPEIEVSWFKLGPGEEQVLIGRMH 6 4 
20 -40 • 60 

200 • 220 • 240 

181 AGLSKSVNELQHTLQALRRETLS YGHTGVGYCPQQGPCYTHSGPYGFQPHQSYEVPRYVP 240 
+ + R+ GYG EV 

65 HDVIFIEWPFRGFFDIHRSANTFFLVVTAANISHDG=NYLCRMKLGETEVTKQE=HLSV= 121 
80 • 100 • 120 

260 • 280 • 300 

241 HPPPPPTSHQAAQAQPPPPGTQAPEAHCVAESTIPEAGAAGNSGPREDTNPQQPTTEGHH 3 00 
P S + + + Q P+ VP GEP G 

122 =VKPLTLSVHSERSQ=FPDFSVLTVTCTVNAFPHPHVQWLMPEG=VE=PAP=TAANGGVM 176 

140 • 160 

320 • 340 

301 RGKKLVQASASGVAQSKEPTTPKAKSVSAHLKSIFCEELLNKRVA* 346 

+ K +A++K P + + * 

177 KEKDGSLSVAVDLSLPKPWHLPVTCVGKNDKEEAHGVYVSGYLSQ* 222 
180 • 200 • 220 

% Identity = 9.5 (33/346) % Homology = 5.5 (19/346) % Total = 15.0 (52/346) 

/// 
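The summary line that closes each alignment appears to report simple ratios: the bracketed counts divided by a common length, with the total being the sum of the identity and homology counts (for Alignment 24, 33 + 19 = 52 over 346). The following is a minimal sketch of that arithmetic only. It is not DNA Strider's own code, the function name `alignment_summary` is hypothetical, and the counts are taken as given from the printout rather than recomputed from the sequences.

```python
def alignment_summary(identical: int, similar: int, length: int) -> dict:
    """Percent identity, homology and total, each rounded to one decimal.

    identical -- count reported in brackets after "% Identity"
    similar   -- count reported in brackets after "% Homology"
    length    -- denominator shown in the brackets (assumed to be the
                 length used by the program for the comparison)
    """
    if length <= 0:
        raise ValueError("length must be positive")

    def pct(count: int) -> float:
        return round(100.0 * count / length, 1)

    return {
        "identity": pct(identical),
        "homology": pct(similar),
        "total": pct(identical + similar),
    }


# Counts copied from the Alignment 24 summary line (33/346 and 19/346).
print(alignment_summary(33, 19, 346))
```

With the Alignment 24 counts this yields 9.5, 5.5 and 15.0, matching the printed summary.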



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 12:53:27 PM (US Letter @ 100%) 

9310-13DVCTDV SEQ ID NO 5.xprt x Bankier et al. BA-RFl.xprt => Protein Alignment 

Protein sequence 24 aa AVDTGSGGGGQP . . . HDTAPRGARKKQ 

Protein sequence 222 aa MARFIAQLLLLA . . . HGVYVSGYLSQ* 

Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 25. Comparison of the amino acid sequence of 
SEQ ID NO: 5 with the amino acid sequence, Ba-RFl, encoded 
by the nucleotide sequence of Fig. 2 of Bankier et al. 



1 MARFIAQLLLLASCVAAGQAVTAFLGERVTLTSYWRRVSLGPEIEVSWFKLGPGEEQVLI 60 

20 • 40 60 



61 GRMHHDVIFIEWPFRGFFDIHRSANTFFLVVTAANISHDGNYLCRMKLGETEVTKQEHLS 120 

80 • 100 • 120 



121 VVKPLTLSVHSERSQFPDFSVLTVTCTVNAFPHPHVQWLMPEGVEPAPTAANGGVMKEKD 180 

140 • 160 • 180 

20 

1 AVDTGSGGGGQPHDTAPRGARKKQ 24 

G + H G + 

181 GSLSVAVDLSLPKPWHLPVTCVGKNDKEEAHGVYVSGYLSQ* 222 

200 • 220 

% Identity = 1.4 (3/222) % Homology = 0.9 (2/222) % Total = 2.3 (5/222) 



/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 12:53:47 PM (US Letter @ 100%) 

9310-13DVCTDV SEQ ID NO 6.xprt x Bankier et al. BA-RFl.xprt => Protein Alignment 

Protein sequence 30 aa STAVAQSATPSV . . . LRAATSGATAAA 

Protein sequence 222 aa MARFIAQLLLLA . . . HGVYVSGYLSQ* 

Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 26. Comparison of the amino acid sequence of 
SEQ ID NO: 6 with the amino acid sequence, Ba-RFl, encoded 
by the nucleotide sequence of Fig. 2 of Bankier et al. 

20 

1 STAV-AQSATPSVSSSISSLRAATSGATAAA 30 

AQ + + AG 

1 MARFIAQLLLLASCVAAGQAVTAFLGERVTLTSYWRRVSLGPEIEVSWFKLGPGEEQVLI 60 

20 • 40 60 



61 GRMHHDVIFIEWPFRGFFDIHRSANTFFLVVTAANISHDGNYLCRMKLGETEVTKQEHLS 120 

80 • 100 • 120 



121 VVKPLTLSVHSERSQFPDFSVLTVTCTVNAFPHPHVQWLMPEGVEPAPTAANGGVMKEKD 180 

140 • 160 • 180 



181 GSLSVAVDLSLPKPWHLPVTCVGKNDKEEAHGVYVSGYLSQ* 222 

200 • 220 

% Identity = 1.8 (4/222) % Homology = 0.9 (2/222) % Total = 2.7 (6/222) 



/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 1:00:06 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 2.xprt x Bankier et al. BN-LF2a,b.xprt => Protein Alignment 

Protein sequence 177 aa MARRLPKPTLQG . . . DTAPRGARKKQ* 

Protein sequence 163 aa MVHVLERALLEQ . . . LSLRCELGWCG* 

Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 27. Comparison of the amino acid sequence 
encoded by the nucleotide sequence of SEQ ID NO: 1 
(SEQ ID NO: 2) with the amino acid sequence, BN-LF2a,b, 
encoded by the nucleotide sequence of Fig. 2 of Bankier et al. 

20 • 40 • 60 

1 MARRLPKPTLQGRLEADFPDSPLLPKFQELNQNNLPNDVFREAQRSYLVFLTSQFCYEEY 60 

ML+L++A PDVRRLVLF 
1 MVHVLERALLEQQSSACGLPGSSTETRPSHPCPEDP=DVSRL==RLLLVVLCVLFGLLCL 57 

20 • 40 

80 • 100 • 120 

61 VQRTFGVPRRQRAIDKRQRASVAGAGAHAHLGGSSATPVQQAQAAASAGTGALASSAPST 120 

+ RR+ ++A+ TS 

58 LLI*EATMRPGRPLAGFYATLRRSFRRMSKRSKNKAKKERVPVEDRPP=TPMPTSQRLIR 116 
60 • 80 • 100 

140 • 160 

121 AVAQSATPSVSSSISSLRAATSGATAAASAAAAVDTGSGGGGQPHDTAPRGARKKQ* 177 

A + R S D S 

117 RNALGGGVRPDAEDCIQRFHPLEPALGVSTKNF=DLLSLRCELGWCG*========= 163 

120 • 140 • 160 

% Identity = 13.0 (23/177) % Homology = 5.1 (9/177) % Total = 18.1 (32/177) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 1:00:26 PM (US Letter @ 100%) 

9310-13DVCTDV SEQ ID NO 4.xprt x Bankier et al. BN-LF2a,b.xprt => Protein Alignment 

Protein sequence 346 aa MLSGNAGEGATA . . . FCEELLNKRVA* 

Protein sequence 163 aa MVHVLERALLEQ . . . LSLRCELGWCG* 



Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 28. Comparison of the amino acid sequence 
encoded by the nucleotide sequence of SEQ ID NO: 3 
(SEQ ID NO: 4) with the amino acid sequence, BN-LF2a,b, 
encoded by the nucleotide sequence of Fig. 2 of Bankier et al. 

20 • 40 

1 MLSGNAGEGATACGGSAAAGQDLISVP-RNTFMTLLQTNLDNKPPRQTPLPYAAPLPPFS 59 

M+ E A S+A G S R + ++ L L 

1 MVHVL==ERALLEQQSSACGLPGSSTETRPSHPCPEDPDVSRLRLLLVVLCVLFGLLCLL 58 

20 • 40 

60 • 80 • 100 

60 HQAIATAPSYGPGAGAVAPAGGYFTSPGGYYAGPAGGDPGAFLAMDAHTYHPHPHPPPAY 119 

AT PAGA F A + TP 

59 LI*EATMRPGRPLAGFYATLRRSFRRMSKRSKNKAKKERVPVEDRPP=TPMPTSQRLIRR 117 
60 • 80 100 

120 • 140 • 160 

120 FGLPGLFGPPPPCLLTTDSHLRADYVPAPSRSNKRKRDPEEDEEGGGLFPGEDATLYRKD 179 

LGP + L + + EG 

118 NALGGGVRPDAEDCIQRFHPLEPALGVSTKNFDLLSLRCELGWCG*============== 163 

120 • 140 • 160 

180 • 200 • 220 

180 IAGLSKSVNELQHTLQALRRETLSYGHTGVGYCPQQGPCYTHSGPYGFQPHQSYEVPRYV 239 



240 • 260 • 280 

240 PHPPPPPTSHQAAQAQPPPPGTQAPEAHCVAESTIPEAGAAGNSGPREDTNPQQPTTEGH 299 



300 • 320 • 340 

300 HRGKKLVQASASGVAQSKEPTTPKAKSVSAHLKSIFCEELLNKRVA* 346 



% Identity = 7.5 (26/347) % Homology = 2.6 (9/347) % Total = 10.1 (35/347) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 1:01:02 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 5.xprt x Bankier et al. BN-LF2a,b.xprt => Protein Alignment 

Protein sequence 24 aa AVDTGSGGGGQP . . . HDTAPRGARKKQ 

Protein sequence 163 aa MVHVLERALLEQ . . . LSLRCELGWCG* 



Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 29. Comparison of the amino acid sequence of 
SEQ ID NO: 5 with the amino acid sequence, BN-LF2a,b, encoded 
by the nucleotide sequence of Fig. 2 of Bankier et al. 

20 

1 AVDTGSGGGGQPHDTAPRGARKKQ 24 

V + +A 

1 MVHVLERALLEQQSSACGLPGSSTETRPSHPCPEDPDVSRLRLLLVVLCVLFGLLCLLLI 60 

20 • 40 60 



61 *EATMRPGRPLAGFYATLRRSFRRMSKRSKNKAKKERVPVEDRPPTPMPTSQRLIRRNAL 120 

80 • 100 • 120 



121 GGGVRPDAEDCIQRFHPLEPALGVSTKNFDLLSLRCELGWCG* 163 

140 • 160 

% Identity = 1.2 (2/163) % Homology = 1.2 (2/163) % Total = 2.5 (4/163) 



/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 1:01:19 PM (US Letter @ 100%) 

9310-13DVCTDV SEQ ID NO 6.xprt x Bankier et al. BN-LF2a,b.xprt => Protein Alignment 

Protein sequence 30 aa STAVAQSATPSV . . . LRAATSGATAAA 

Protein sequence 163 aa MVHVLERALLEQ . . . LSLRCELGWCG* 

Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 30. Comparison of the amino acid sequence of 
SEQ ID NO: 6 with the amino acid sequence, BN-LF2a,b, encoded 
by the nucleotide sequence of Fig. 2 of Bankier et al. 

20 

1 STAVAQSATPSVSSSISSLRAATSGATAAA 30 

V + A SS L +++ + 

1 MVHVLERALLEQQSSACGLPGSSTETRPSHPCPEDPDVSRLRLLLVVLCVLFGLLCLLLI 60 

20 • 40 60 



61 *EATMRPGRPLAGFYATLRRSFRRMSKRSKNKAKKERVPVEDRPPTPMPTSQRLIRRNAL 120 

80 • 100 • 120 



121 GGGVRPDAEDCIQRFHPLEPALGVSTKNFDLLSLRCELGWCG* 163 

140 • 160 

% Identity = 3.1 (5/163) % Homology = 3.1 (5/163) % Total = 6.1 (10/163) 



/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 12:56:55 PM (US Letter @ 100%) 

9310-13DVCTDV SEQ ID NO 2.xprt x Bankier et al. BN-LFlb.xprt => Protein Alignment 

Protein sequence 177 aa MARRLPKPTLQG . . . DTAPRGARKKQ* 

Protein sequence 269 aa VLGIWIYLLEML . . . PHGPVQLSYYD* 

Alignment 31. Comparison of the amino acid sequence 
encoded by the nucleotide sequence of SEQ ID NO: 1 
(SEQ ID NO:2) with the amino acid sequence, BN-LFlb, 
encoded by the nucleotide sequence of Fig. 2 of Bankier et al. 



Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 



1 VLGIWIYLLEMLWRLGATIWQLLAFFLAFFLDLILLIIALYLQQNWWTLLVDLLWLLLFL 60 

20 • 40 60 

20 

1 MARRLPKPTLQGRLEADFPDSPLLPKFQ 28 

+ +GR + P 

61 AILIWMYYHGQRHSDEHHHDDSLPHPQQATDDSGHESDSNSNEGRHHLLVSGAGDGPPLC 120 

80 • 100 • 120 

40 • 60 • 80 
29 ELNQNNLPNDVFREAQRSYLVFLTSQFCYEEYVQRTFGVPRRQRAIDKRQRASVAGAGAH 88 
N Q + P Q D 

121 SQNLGAPGGGPDNGPQDPDNTDDNGPQDPDNTDDNGPHDPLPQ=DPDNTDDNGPQDPDNT 179 

140 • 160 

100 • 120 • 140 

89 AHLGGSSATPVQQAQAAASAGTGALASSAPSTAVAQSATPSVSSSISSLRAATSGATAAA 148 
G P++A+GG + P++ +SG 

180 DDNGPHDPLPHSPSDSAGNDG=GPPQLTEEVENKGGDQGPPLMTDGGGGHSHDSGHGGGD 238 
180 • 200 • 220 

160 

149 SAAAAVDTG-SGGGGQPHD-TAPRGARKKQ* 177 

+ G SG GG D P * 
239 PHLPTLLLGSSGSGGDDDDPHGPVQLSYYD* 269 
240 • 260 

% Identity = 8.9 (24/271) % Homology = 4.4 (12/271) % Total = 13.3 (36/271) 



/// 




### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 12:57:10 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 4.xprt x Bankier et al. BN-LFlb.xprt => Protein Alignment 

Protein sequence 346 aa MLSGNAGEGATA . . . FCEELLNKRVA* 

Protein sequence 269 aa VLGIWIYLLEML . . . PHGPVQLSYYD* 



Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 



Alignment 32. Comparison of the amino acid sequence 
encoded by the nucleotide sequence of SEQ ID NO:3 
(SEQ ID NO:4) with the amino acid sequence, BN-LFlb, 
encoded by the nucleotide sequence of Fig. 2 of Bankier et al. 



20 • 40 • 60 

1 MLSGNAGEGATACGGSAAAGQDLISVPRNTFMTLLQTNLDNKPPRQTPLPYAAPLPPFSH 60 



80 • 100 • 120 

61 QAIATAPSYGPGAGAVAPAGGYFTSPGGYYAGPAGGDPGAFLAMDAHTYHPHPHPPPAYF 120 

V Y + G AF 

1 ===============VLGIWIY=LLEMLWRLGATIWQLLAFFLAFFLDLILLIIALYLQQ 44 

20 • 40 

140 • 160 • 180 

121 GLPGLFGPPPPCLLTTDSHLRADYVPAPSRSNKRKRDPEEDEEGGGLFPGEDATLYRKDI 180 
L LL + L Y S + + D + 

45 NWWTL=LVDLLWLLLFLAILIWMYYHGQRHSDEHHHD=DSLPHPQQATDDSGHESDSNSN 102 

60 • 80 • 100 

200 • 220 • 240 

181 AGLSKSVNELQHTLQALRRETLSYGHTGVGYCPQQGPCYTHSGPYGFQPHQSYEVPRYVP 240 

G + L + L G PQ +GP + P 

103 EGRHHLLVSGAGDGPPLCSQNLGAPGGGPDNGPQDPDNTDDNGPQDPDNTDDNGPHDPLP 162 

120 • 140 • 160 

260 • 280 

241 -HPPPPPTSHQAAQAQPPPPGTQAPEAHCVAESTIPEAGAAGNSGPREDTNPQQPTTEGH 299 

P + G P H ++S + G + E+ Q 

163 QDPDNTDDNGPQDPDNTDDNGPHDPLPHSPSDSAGNDGGPPQLTEEVENKGGDQGPPLMT 222 

180 • 200 • 220 

300 • 320 • 340 

300 HRGKKLVQASASGVAQSKEPTTPKAKSVSAHLKSIFCEELLNKRVA* 346 

G SG PT SS + * 

223 DGGGGHSHDSGHGGGDPHLPTLLLGSSGSGGDDDDPHGPVQLSYYD* 269 

240 • 260 



% Identity = 10.7 (37/347) % Homology = 4.6 (16/347) % Total = 15.3 (53/347) 



/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 12:58:07 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 5.xprt x Bankier et al. BN-LFlb.xprt => Protein Alignment 

Protein sequence 24 aa AVDTGSGGGGQP . . . HDTAPRGARKKQ 

Protein sequence 269 aa VLGIWIYLLEML . . . PHGPVQLSYYD* 

Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 33. Comparison of the amino acid sequence of 
SEQ ID NO: 5 with the amino acid sequence, BN-LFlb, encoded 
by the nucleotide sequence of Fig. 2 of Bankier et al. 



1 VLGIWIYLLEMLWRLGATIWQLLAFFLAFFLDLILLIIALYLQQNWWTLLVDLLWLLLFL 60 

20 • 40 • 60 



61 AILIWMYYHGQRHSDEHHHDDSLPHPQQATDDSGHESDSNSNEGRHHLLVSGAGDGPPLC 120 

80 • 100 • 120 



121 SQNLGAPGGGPDNGPQDPDNTDDNGPQDPDNTDDNGPHDPLPQDPDNTDDNGPQDPDNTD 180 

140 • 160 • 180 



181 DNGPHDPLPHSPSDSAGNDGGPPQLTEEVENKGGDQGPPLMTDGGGGHSHDSGHGGGDPH 240 

200 • 220 • 240 

20 

1 AVDTGSGGGGQPHDTAPRGARKKQ 24 

+GSGG + + 

241 LPTLLLGSSGSGGDDDDPHGPVQLSYYD* 269 

260 

% Identity = 1.5 (4/269) % Homology = 1.1 (3/269) % Total = 2.6 (7/269) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 12:58:28 PM (US Letter @ 100%) 

9310-13DVCTDV SEQ ID NO 6.xprt x Bankier et al. BN-LFlb.xprt => Protein Alignment 

Protein sequence 30 aa STAVAQSATPSV . . . LRAATSGATAAA 

Protein sequence 269 aa VLGIWIYLLEML . . . PHGPVQLSYYD* 

Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 34. Comparison of the amino acid sequence of 
SEQ ID NO: 6 with the amino acid sequence, BN-LFlb, encoded 
by the nucleotide sequence of Fig. 2 of Bankier et al. 



1 VLGIWIYLLEMLWRLGATIWQLLAFFLAFFLDLILLIIALYLQQNWWTLLVDLLWLLLFL 60 

20 • 40 • 60 



61 AILIWMYYHGQRHSDEHHHDDSLPHPQQATDDSGHESDSNSNEGRHHLLVSGAGDGPPLC 120 

80 • 100 • 120 



121 SQNLGAPGGGPDNGPQDPDNTDDNGPQDPDNTDDNGPHDPLPQDPDNTDDNGPQDPDNTD 180 

140 • 160 • 180 

1 ST 2 

181 DNGPHDPLPHSPSDSAGNDGGPPQLTEEVENKGGDQGPPLMTDGGGGHSHDSGHGGGDPH 240 

200 • 220 • 240 

20 

3 AVAQSATPSVS-SSISSLRAATSGATAAA 30 
S S + 

241 LPTLLLGSSGSGGDDDDPHGPVQLSYYD* 269 

260 

% Identity = 0.7 (2/269) % Homology = 0.4 (1/269) % Total = 1.1 (3/269) 



/// 




### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 12:54:23 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 2.xprt x Bankier et al. BN-LFla.xprt => Protein Alignment 

Protein sequence 177 aa MARRLPKPTLQG . . . DTAPRGARKKQ* 

Protein sequence 144 aa MEHDLERGPPGP . . . LGIVLFIFGCLL 



Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 35. Comparison of the amino acid sequence 
encoded by the nucleotide sequence of SEQ ID NO: 1 
(SEQ ID NO: 2) with the amino acid sequence, BN-LFla, 
encoded by the nucleotide sequence of Fig. 2 of Bankier et al. 

20 • 40 • 60 

1 MARRLPKPTLQGRLEADFPDSPLLPKFQELNQNNLPNDVFREAQRSYLVFLTSQFCYEEY 60 

ML+RP LL+S Y 

1 MEHDLERGPPGPRRPPRGPPLSSSLGLALLLLL=LALLFWLYIVMSDWTGGALLVLYSFA 59 

20 • 40 

80 • 100 • 120 

61 VQRTFGVPRRQRAIDKRQRASVAGAGAHAHLGGSSATPVQQAQAAASAGTGALASSAPST 12 0 

+ + I +R GA L T S+ + AL+ +P T 

60 LMLIIIILIIF==IFRRDLLCPLGALCILLLMSKYYTLCPTPPFPYSSFSNALSPLSPVT 117 
60 80 100 

140 • 160 

121 AVAQSATPSVSSSISSLRAATSGATAAASAAAAVDTGSGGGGQPHDTAPRGARKKQ* 177 

+ A ++ L 

118 LLLI=ALWNLHGQALFLGIVLFIFGCLL============================= 144 

120 • 140 



% Identity = 11.9 (21/177) % Homology = 6.8 (12/177) % Total = 18.6 (33/177) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 12:54:42 PM (US Letter @ 100%) 
9310-13DVCTDV SEQ ID NO 4.xprt x Bankier et al. BN-LFla.xprt => Protein Alignment 

Protein sequence 346 aa MLSGNAGEGATA . . . FCEELLNKRVA* 

Protein sequence 144 aa MEHDLERGPPGP . . . LGIVLFIFGCLL 

Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 36. Comparison of the amino acid sequence 
encoded by the nucleotide sequence of SEQ ID NO: 3 
(SEQ ID NO: 4) with the amino acid sequence, BN-LFla, 
encoded by the nucleotide sequence of Fig. 2 of Bankier et al. 

20 • 40 • 60 

1 MLSGNAGEGATACGGSAAAGQDLISVPRNTFMTLLQTNLDNKPPRQTPLPYAAPLPPFSH 60 

M G GLS +LLL + AL+S 

1 MEHDLER=GPPGPRRPPR=GPPLSSSLGLALLLLLLALLFWLYIVMSDWTGGALLVLYSF 58 

20 • 40 

80 • 100 • 120 

61 QAIATAPSYGPGAGAVAPAGGYFTSPGGYYAGPAGGDPGAFLAMDAHTYHPHPHPPPAYF 120 

+ + + P 

59 ALMLIIIILIIFIFRRDLLCPLGALCILLLMSKYYTLCPTPPFPYSSFSNALSPLSPVTL 118 
60 • 80 100 

140 • 160 • 180 

121 GLPGLFGPPPPCLLTTDSHLRADYVPAPSRSNKRKRDPEEDEEGGGLFPGEDATLYRKDI 180 

L L+ L + 

119 LLIALWNLHGQALFLGIVLFIFGCLL================================== 144 

120 • 140 

200 • 220 • 240 

181 AGLSKSVNELQHTLQALRRETLSYGHTGVGYCPQQGPCYTHSGPYGFQPHQSYEVPRYVP 240 



260 • 280 • 300 

241 HPPPPPTSHQAAQAQPPPPGTQAPEAHCVAESTIPEAGAAGNSGPREDTNPQQPTTEGHH 300 



320 • 340 

301 RGKKLVQASASGVAQSKEPTTPKAKSVSAHLKSIFCEELLNKRVA* 346 



% Identity = 4.3 (15/346) % Homology = 2.3 (8/346) % Total = 6.6 (23/346) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 12:55:12 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 5.xprt x Bankier et al. BN-LFla.xprt => Protein Alignment 

Protein sequence 24 aa AVDTGSGGGGQP . . . HDTAPRGARKKQ 

Protein sequence 144 aa MEHDLERGPPGP . . . LGIVLFIFGCLL 

Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 37. Comparison of the amino acid sequence of 
SEQ ID NO: 5 with the amino acid sequence, BN-LFla, encoded 
by the nucleotide sequence of Fig. 2 of Bankier et al. 

20 

1 AVDTGSGGGGQPHDTAPRGARKKQ 24 

G P PRG 

1 MEHDLERGPPGPRRP=PRGPPLSSSLGLALLLLLLALLFWLYIVMSDWTGGALLVLYSFA 59 

20 • 40 



60 LMLIIIILIIFIFRRDLLCPLGALCILLLMSKYYTLCPTPPFPYSSFSNALSPLSPVTLL 119 
60 • 80 100 



120 LIALWNLHGQALFLGIVLFIFGCLL 144 
120 • 140 

% Identity = 3.4 (5/145) % Homology = 0.0 (0/145) % Total = 3.4 (5/145) 

/// 



### DNA Strider 1.4f6 ### Wednesday, July 18, 2007 12:55:30 PM (US Letter @ 100%) 



9310-13DVCTDV SEQ ID NO 6.xprt x Bankier et al. BN-LFla.xprt => Protein Alignment 

Protein sequence 30 aa STAVAQSATPSV . . . LRAATSGATAAA 

Protein sequence 144 aa MEHDLERGPPGP . . . LGIVLFIFGCLL 



Method: Diagonals (BLOSUM62) 

Layout: Standard 

Block Length <: 6-aa 

Mismatch penalty: Smaller (1) 

Gap penalty: Medium (2) 

Display: BLOSUM62 

Alignment 38. Comparison of the amino acid sequence of 
SEQ ID NO: 6 with the amino acid sequence, BN-LFla, encoded 
by the nucleotide sequence of Fig. 2 of Bankier et al. 

20 

1 STAVAQSATPSVSSSISSLRAATSGATAAA 30 

+ P ++S A 

1 MEHDLERGPPGPRRPPRGPPLSSSLGLALLLLLLALLFWLYIVMSDWTGGALLVLYSFAL 60 

20 • 40 60 



61 MLIIIILIIFIFRRDLLCPLGALCILLLMSKYYTLCPTPPFPYSSFSNALSPLSPVTLLL 120 

80 • 100 • 120 



121 IALWNLHGQALFLGIVLFIFGCLL 144 

140 

% Identity = 2.1 (3/144) % Homology = 2.1 (3/144) % Total = 4.2 (6/144) 



/// 



